TITLE OF THE INVENTION

A PROCESSOR FOR EXECUTING INSTRUCTIONS IN UNITS THAT
ARE UNRELATED TO THE UNITS IN WHICH INSTRUCTIONS ARE READ,
AND A COMPILER, AN OPTIMIZATION APPARATUS, AN ASSEMBLER, A
LINKER, A DEBUGGER AND A DISASSEMBLER FOR SUCH PROCESSOR

This application is based on an application No. H10-118326
filed in Japan, the content of which is hereby
incorporated by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a processor for
executing instructions in units that are unrelated to the
units in which instructions are read, and a compiler, an
optimization apparatus, an assembler, a linker, a debugger
and a disassembler for such processor.

2. Description of the Prior Art

Processors conventionally read and execute
instructions stored in memory according to a program
counter. Fig. 1 is a block diagram showing the basic
construction of an example processor.

The instruction memory 4301 stores four 8-bit
instructions as one instruction packet.

The program counter 4300 indicates the address of an
instruction packet in the instruction memory 4301.

The instruction reading unit 4302 reads the
instruction packet indicated by the program counter 4300
from the instruction memory 4301.

The instruction executing unit 4303 executes all four
instructions included in the read instruction packet in one
cycle.

In this way, a conventional processor can read an
instruction packet that is indicated by the program counter
and can execute four instructions in the instruction
packet.

The above processor has to execute all of the
instructions in the read instruction packet in one cycle.
Accordingly, when one or more instructions in an
instruction packet cannot be executed due to problems with
computer system resources such as memory or I/O, none of
the instructions in the instruction packet can be executed
until such problems are resolved. This slows program
execution.

SUMMARY OF THE INVENTION 

In view of the stated problems, it is a primary
object of the present invention to provide a processor that
executes instructions in units that are unrelated to the
units in which instructions are read from a program, and a
program development environment for generating suitable
programs.

This primary object is achieved by a processor for
reading instructions from a memory according to a program
counter, the memory storing instructions in one-byte units,
and for executing the read instructions, the program
counter including a first program counter and a second
program counter, the first program counter indicating a
storage position of a processing packet in the memory, the
processing packet being composed of an integer number of
the one-byte units, the second program counter indicating a
position of a processing target instruction in the processing
packet, the processing target instruction being an
operation to be executed by the processor.

With the stated construction, the first program
counter indicates a storage position in the memory of a
processing packet whose size is an integer number of bytes.
Reads from the memory are performed based on this first
program counter. The second program counter can indicate
any position of a processing target instruction included in
the processing packet read from the memory. As a result,
the instruction(s) to be executed can be freely set
regardless of the amount of data read in one read
operation. This means that instructions whose word length
is not an integer number of bytes can be executed even when
read operations from the memory to the processor are
performed in units of an integer number of bytes.
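
As an illustration only, the two-counter addressing described
above can be sketched in Python. The eight-byte packet, the
21-bit instruction unit, the big-endian packing and the names
PACKET_BYTES, UNIT_BITS and fetch_unit are assumptions made
for this sketch and are not part of the stated construction;
the first program counter is treated here as a byte address.

```python
PACKET_BYTES = 8   # assumed packet size: an integer number of bytes
UNIT_BITS = 21     # assumed instruction width: not a multiple of 8 bits

def fetch_unit(memory: bytes, pc1: int, pc2: int) -> int:
    """Return the instruction unit selected by the two counters.

    pc1 selects the packet (byte address in memory);
    pc2 selects the unit inside that packet, counted from the
    most significant end of the packet.
    """
    packet = int.from_bytes(memory[pc1:pc1 + PACKET_BYTES], "big")
    shift = PACKET_BYTES * 8 - (pc2 + 1) * UNIT_BITS
    return (packet >> shift) & ((1 << UNIT_BITS) - 1)
```

In this sketch the memory is always read in whole bytes, while
the unit that is executed is chosen independently by pc2.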

Here, the processor may include a first program
counter updating unit and a second program counter updating
unit, the second program counter updating unit incrementing
a value of the second program counter in accordance with an
amount of instructions that were executed in a preceding
cycle and sending any carry generated in an incrementing to
the first program counter updating unit, and the first
program counter updating unit adding the carry received
from the second program counter updating unit to the value
of the first program counter.

With the stated construction, the value of the
program counter is incremented by the amount of
instructions that have just been executed, so that the
program counter can be updated to indicate the first
position of the instructions to be executed in the next
cycle.
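
The update with carry described above might be sketched as
follows. The value UNITS_PER_PACKET = 4 (a second program
counter that wraps after four positions) and the function name
advance are assumptions chosen for illustration; the text does
not fix these values.

```python
UNITS_PER_PACKET = 4  # assumed number of positions counted by the second PC

def advance(pc1: int, pc2: int, executed_units: int) -> tuple[int, int]:
    """Advance both counters by the units executed in the last cycle."""
    # Second program counter updating unit: increment, then split
    # the sum into a carry and the new in-packet position.
    carry, new_pc2 = divmod(pc2 + executed_units, UNITS_PER_PACKET)
    # First program counter updating unit: add the received carry.
    return pc1 + carry, new_pc2
```

For example, starting at position 3 of packet 5 and executing
two units moves the counters to position 1 of packet 6.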

Here, the processor may further include: a program
counter relative value extracting unit for extracting, when
an instruction being executed includes a program counter
relative value that is based on an address of a first
instruction executed in a present cycle, the program
counter relative value; and a calculating unit for adding
the program counter relative value to the value of the
first program counter and the value of the second program
counter, and setting an addition result as the value of the
first program counter and the value of the second program
counter.

When the processor executes a branch instruction, the
value of the program counter is added to a program counter
relative value that is a difference in addresses between
the present branch instruction and the branch destination
instruction. The result of this addition is then set as
the new value of the program counter to have the program
counter indicate the branch destination instruction.

Here, the calculating unit may include a first
calculating unit and a second calculating unit, the second
calculating unit adding the value of the second program
counter and lower bits of the program counter relative
value, setting a result of an addition as the value of the
second program counter, and sending any carry generated in
the addition to the first calculating unit, and the first
calculating unit adding the value of the first program
counter, upper bits of the program counter relative value,
and any carry received from the second calculating unit,
and setting a result of an addition as the value of the
first program counter.

When the processor executes a branch instruction and
the program counter and a program counter relative value
are added, a carry generated when calculating the lower
bits is properly considered when calculating the upper
bits. In this way, addresses can be calculated with proper
continuity between the calculation of the lower bits and
the calculation of the upper bits.
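
A minimal sketch of this carry method is given below. It
assumes a 2-bit second program counter (LOW_BITS = 2) with
ordinary binary wraparound; this width and the name
branch_with_carry are hypothetical and are used only to make
the arithmetic concrete.

```python
LOW_BITS = 2                      # assumed width of the second PC
LOW_MASK = (1 << LOW_BITS) - 1

def branch_with_carry(pc1: int, pc2: int, rel: int) -> tuple[int, int]:
    """Add a PC-relative value using separate lower and upper adders."""
    # Second calculating unit: lower bits, producing a carry.
    low_sum = pc2 + (rel & LOW_MASK)
    new_pc2 = low_sum & LOW_MASK
    carry = low_sum >> LOW_BITS
    # First calculating unit: upper bits plus the received carry.
    new_pc1 = pc1 + (rel >> LOW_BITS) + carry
    return new_pc1, new_pc2
```

Because the carry is passed upward, the result agrees with a
single addition performed on the concatenated address.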

Here, the calculating unit may include a first
calculating unit and a second calculating unit, the second
calculating unit adding the value of the second program
counter and lower bits of the program counter relative
value without generating a carry, and setting a result of
an addition as the value of the second program counter, the
first calculating unit adding the value of the first
program counter and upper bits of the program counter
relative value, and setting a result of an addition as the
value of the first program counter.

When the processor executes a branch instruction,
calculation of the lower bits of the value of the program
counter and the program counter relative value by the
second calculating unit does not generate a carry to the
calculation of the upper bits of the value of the program
counter and the program counter relative value by the first
calculating unit. As a result, the calculations of the
first and second calculating units can be performed
independently of one another, so that a simplified hardware
construction can be used.
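
The independence of the two calculations can be illustrated
with the following sketch, which again assumes a hypothetical
2-bit second program counter. The helper encode_relative
stands in for the tool that produces a matching relative value
and is likewise an assumption of this example.

```python
LOW_BITS = 2                      # assumed width of the second PC
LOW_MASK = (1 << LOW_BITS) - 1

def encode_relative(src: tuple[int, int], dst: tuple[int, int]) -> tuple[int, int]:
    """Build (upper, lower) relative fields with no borrow between them."""
    (src1, src2), (dst1, dst2) = src, dst
    return dst1 - src1, (dst2 - src2) & LOW_MASK

def branch_without_carry(pc1: int, pc2: int,
                         rel_upper: int, rel_lower: int) -> tuple[int, int]:
    """Add the two fields separately; any lower carry is discarded."""
    new_pc2 = (pc2 + rel_lower) & LOW_MASK   # second calculating unit
    new_pc1 = pc1 + rel_upper                # first calculating unit
    return new_pc1, new_pc2
```

The branch reaches the intended destination only when the
relative value was produced by the matching carry-free
subtraction, which is why the development tools need a
corresponding mode.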

Here, the calculating unit may add the value of the
first program counter and upper bits of the program counter
relative value, set a result of an addition as the value
of the first program counter, and set lower bits of the
program counter relative value as the value of the second
program counter.

When the processor executes a branch instruction, no
calculation using the value of the second program counter
and the lower bits of the program counter relative value is
required, so that the processor can execute branch
instructions at a higher speed.



Here, the calculating unit may add the program
counter relative value and a value whose upper bits are the
value of the first program counter and whose lower bits are
the value of the second program counter, and set upper bits
of a result of an addition as the value of the first program
counter and lower bits of the result as the value of the
second program counter.

When the processor executes a branch instruction, the
calculation using the value of the program counter and the
program counter relative value can be performed by a
standard calculator. This means the hardware construction
of the processor can be simplified.

Here, the processor may further include: a program
counter relative value extracting unit for extracting, when
an executed instruction includes a program counter relative
value that is based on an address of the executed
instruction, the program counter relative value; a program
counter amending unit for amending the value of the first
program counter and the value of the second program counter
to indicate an address of the executed instruction; and a
calculating unit for adding the program counter relative
value, the value of the first program counter, and the
value of the second program counter, and setting a result
of an addition as the value of the first program counter
and the value of the second program counter.

The program counter relative value is the difference
in addresses between a branch instruction and the branch
destination instruction, so that it will not be necessary
to change the program counter relative value even when
there is a change in the boundaries marking which
instructions in the program will be executed in parallel.

Here, the processor may further include: a program
counter relative value calculating instruction decoding
unit for decoding a program counter relative value
calculating instruction that performs an addition using a
program counter relative value and one of (a) a value of
the program counter stored in a register, and (b) the value
of the first program counter and the value of the second
program counter; a calculating unit for performing the
addition indicated by the program counter relative value
calculating instruction to generate an addition result; and
a program counter value updating unit for storing the
addition result in one of (a) the register, and (b) the
first program counter and the second program counter.

With the stated construction, it is possible to use
an instruction that indicates a calculation using the value
of the program counter and a program counter relative value
in place of an instruction that stores the absolute address
of a function into a register. A program counter relative
value has a shorter bit width than the absolute address of
an instruction, so that the overall code size can be
reduced. When using position-independent code (PIC), where
the addresses of instructions in memory are only determined
when the program is executed, absolute addresses cannot be
used, so that calculation instructions that use the program
counter and a program counter relative value are essential.

Here, the first program counter may indicate a memory
address, the memory address being a storage position in the
memory of a processing packet that is given by bit shifting
the value in the first program counter by log2 n bits in a
leftward direction, n being a length of a processing packet
in bytes.

With the stated construction, while separate
addresses are assigned to each one-byte storage packet in
the memory, the value of the first program counter
corresponds with the address of a processing packet in the
memory. As a result, the processor can easily specify a
processing packet in the memory.
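
The shift that turns the first program counter into a byte
address can be sketched as below. The function name
packet_address is hypothetical, and the sketch assumes that
the packet length n is a power of two so that log2 n is an
integer.

```python
def packet_address(pc1: int, packet_bytes: int) -> int:
    """Return the byte address of the packet indicated by pc1."""
    shift = packet_bytes.bit_length() - 1       # log2 n when n is a power of two
    if packet_bytes <= 0 or (1 << shift) != packet_bytes:
        raise ValueError("packet length must be a power of two")
    return pc1 << shift                          # leftward shift by log2 n bits
```

With eight-byte packets, for instance, a first program counter
value of 3 corresponds to byte address 24.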

Here, the processor may further include: an
instruction buffer for temporarily storing instructions;
and an instruction reading unit for transferring
instructions with a minimum transfer size of one one-byte
unit from the memory to the instruction buffer, in
accordance with available space in the instruction buffer
but regardless of a size of a processing packet.

With the stated construction, the amount of data read
by the processor from the memory in one read operation can
be freely set, so that the construction in the processor
for reading instructions can be made highly flexible.

The stated primary object can also be achieved by an
instruction sequence optimizing apparatus for generating
optimized code from an instruction sequence, including: an
address assigning unit for estimating a size of each
instruction in the instruction sequence and assigning an
address to each instruction, upper bits of each address
indicating a memory address at which a processing packet is
stored and lower bits of each address indicating a
processing target instruction in the processing packet; a
label detecting unit (1) for detecting a label, which
should be resolved by an address of a specified
instruction, from the instruction sequence, and obtaining
the address of the specified instruction, and (2) for
detecting a label, which should be resolved by a difference
in addresses of two specified instructions, from the
instruction sequence, and obtaining the addresses of the
two specified instructions; a program counter relative
value calculating unit for calculating, when a label which
should be resolved by a difference in addresses of two
specified instructions has been detected, a program counter
relative value by subtracting an address of one of the two
specified instructions from an address of another of the
two specified instructions; a converting unit (1) for
converting an instruction that has a label that should be
resolved by an address of a specified instruction into an
instruction with a size that is based on a size of the
address of the specified instruction, and (2) for converting
an instruction that has a label that should be resolved by a
difference in addresses of two specified instructions into
an instruction with a size that is based on a size of the
program counter relative value calculated from the
addresses of the two specified instructions; and an
optimized code generating unit for generating optimized
code by converting addresses of instructions in accordance
with the sizes of instructions after conversion by the
converting unit.

The above construction achieves an optimization
apparatus for generating programs for a processor that
executes branch instructions.

Here, the program counter relative value calculating
unit may include a lower bit subtracting unit and an upper
bit subtracting unit, the lower bit subtracting unit
subtracting lower bits of the address of the one of the two
specified instructions from lower bits of the address of
the other of the two specified instructions, setting a
result of a subtraction as lower bits of the program
counter relative value, and sending any carry generated in
the subtraction to the upper bit subtracting unit, and the
upper bit subtracting unit subtracting upper bits of the
address of the one of the two specified instructions and any
carry received from the lower bit subtracting unit from
upper bits of the address of the other of the two specified
instructions, and setting a result of a subtraction as
upper bits of the program counter relative value.
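
The two-stage subtraction might look like the following
sketch, in which the "carry" of the text is modelled as a
borrow from the upper bits. The 2-bit lower field and the name
pc_relative_with_borrow are assumptions made only for this
example; addresses are written as (upper, lower) pairs.

```python
LOW_BITS = 2                      # assumed width of the lower address field
LOW_MASK = (1 << LOW_BITS) - 1

def pc_relative_with_borrow(src: tuple[int, int],
                            dst: tuple[int, int]) -> tuple[int, int]:
    """Return (upper, lower) bits of dst - src, passing a borrow upward."""
    (src1, src2), (dst1, dst2) = src, dst
    # Lower bit subtracting unit.
    low_diff = dst2 - src2
    borrow = 1 if low_diff < 0 else 0
    rel_lower = low_diff & LOW_MASK
    # Upper bit subtracting unit, using the received borrow.
    rel_upper = dst1 - src1 - borrow
    return rel_upper, rel_lower
```

For a branch from (3, 3) to (5, 1) this gives (1, 2), which is
the value 6 when the two fields are concatenated, matching the
distance between the two addresses.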

The above construction achieves an optimization
apparatus for generating programs for a processor which,
when executing a branch instruction, calculates the address
of a branch destination instruction using a carry method.

Here, the program counter relative value calculating
unit may include a lower bit subtracting unit and an upper
bit subtracting unit, the lower bit subtracting unit
subtracting lower bits of the address of one of the two
specified instructions from lower bits of the address of
the other of the two specified instructions without
generating a carry and setting a result of a subtraction as
lower bits of the program counter relative value, and the
upper bit subtracting unit subtracting upper bits of the
address of the one of the two specified instructions from
upper bits of the address of the other of the two specified
instructions, and setting a result of a subtraction as
upper bits of the program counter relative value.

The above construction achieves an optimization
apparatus for generating programs for a processor which,
when executing a branch instruction, calculates the address
of a branch destination instruction without using a carry.

Here, the program counter relative value calculating
unit may subtract upper bits of an address of one of the
two specified instructions from upper bits of an address of
the other of the two specified instructions, set a result
of a subtraction as upper bits of the program counter
relative value, and set lower bits of the address of the
other of the two specified instructions as lower bits of the
program counter relative value.

The above construction achieves an optimization
apparatus for generating programs for a processor which,
when executing a branch instruction, calculates the address
of a branch destination instruction using an absolute
value.

The stated primary object can also be achieved by an
assembler that generates relocatable code from an
instruction sequence, each address of an instruction in the
instruction sequence having upper bits that indicate a
memory address at which a processing packet is stored and
lower bits that indicate a position of a processing target
instruction that is included in the processing packet, the
assembler including: a label detecting unit for detecting a
label in the instruction sequence that should be resolved
by a difference in addresses between two specified
instructions, and obtaining the addresses of the two
specified instructions; a program counter relative value
calculating unit for calculating a program counter relative
value by subtracting an address of one of the two specified
instructions from an address of another of the two
specified instructions; and a replacing unit for replacing
the label with the program counter relative value
calculated by the program counter relative value
calculating unit.

The above construction achieves an assembler for
generating programs for a processor that executes branch
instructions.

Here, the program counter relative value calculating
unit may include a lower bit subtracting unit and an upper
bit subtracting unit, the lower bit subtracting unit
subtracting lower bits of the address of the one of the two
specified instructions from lower bits of the address of
the other of the two specified instructions, setting a
result of a subtraction as lower bits of the program
counter relative value, and sending any carry generated in
the subtraction to the upper bit subtracting unit, and the
upper bit subtracting unit subtracting upper bits of the
address of the one of the two specified instructions and any
carry received from the lower bit subtracting unit from
upper bits of the address of the other of the two specified
instructions, and setting a result of a subtraction as
upper bits of the program counter relative value.

The above construction achieves an assembler for
generating programs for a processor which, when executing a
branch instruction, calculates the address of a branch
destination instruction using a carry method.

Here, the program counter relative value calculating
unit may include a lower bit subtracting unit and an upper
bit subtracting unit, the lower bit subtracting unit
subtracting lower bits of the address of one of the two
specified instructions from lower bits of the address of
the other of the two specified instructions without
generating a carry and setting a result of a subtraction as
lower bits of the program counter relative value, and the
upper bit subtracting unit subtracting upper bits of the
address of the one of the two specified instructions from
upper bits of the address of the other of the two specified
instructions, and setting a result of a subtraction as
upper bits of the program counter relative value.

The above construction achieves an assembler for
generating programs for a processor which, when executing a
branch instruction, calculates the address of a branch
destination instruction without using a carry.

Here, the program counter relative value calculating
unit may subtract upper bits of an address of one of the
two specified instructions from upper bits of an address of
the other of the two specified instructions, set a result
of a subtraction as upper bits of the program counter
relative value, and set lower bits of the address of the
other of the two specified instructions as lower bits of the
program counter relative value.

The above construction achieves an optimization
apparatus for generating programs for a processor which,
when executing a branch instruction, calculates the address
of a branch destination instruction using an absolute
value.

The stated primary object can also be achieved by a
linker that generates object code by combining relocatable
code, each address of an instruction in the relocatable
code having upper bits that indicate a memory address at
which a processing packet is stored and lower bits that
indicate a position of a processing target instruction that
is included in the processing packet, the linker including:
a relocation information detecting unit for detecting a
label in the relocatable code that should be resolved by a
difference in addresses between two specified instructions,
and obtaining the addresses of the two specified
instructions; a program counter relative value calculating
unit for calculating a program counter relative value by
subtracting an address of one of the two specified
instructions from an address of another of the two
specified instructions; and a replacing unit for replacing
the label with the program counter relative value
calculated by the program counter relative value
calculating unit.

The above construction achieves a linker for
generating programs for a processor that executes branch
instructions.

Here, the program counter relative value calculating
unit may include a lower bit subtracting unit and an upper
bit subtracting unit, the lower bit subtracting unit
subtracting lower bits of the address of the one of the two
specified instructions from lower bits of the address of
the other of the two specified instructions, setting a
result of a subtraction as lower bits of the program
counter relative value, and sending any carry generated in
the subtraction to the upper bit subtracting unit, and the
upper bit subtracting unit subtracting upper bits of the
address of the one of the two specified instructions and any
carry received from the lower bit subtracting unit from
upper bits of the address of the other of the two specified
instructions, and setting a result of a subtraction as
upper bits of the program counter relative value.

The above construction achieves a linker for
generating programs for a processor which, when executing a
branch instruction, calculates the address of a branch
destination instruction using a carry method.

Here, the program counter relative value calculating
unit may include a lower bit subtracting unit and an upper
bit subtracting unit, the lower bit subtracting unit
subtracting lower bits of the address of one of the two
specified instructions from lower bits of the address of
the other of the two specified instructions without
generating a carry and setting a result of a subtraction as
lower bits of the program counter relative value, and the
upper bit subtracting unit subtracting upper bits of the
address of the one of the two specified instructions from
upper bits of the address of the other of the two specified
instructions, and setting a result of a subtraction as
upper bits of the program counter relative value.

The above construction achieves a linker for
generating programs for a processor which, when executing a
branch instruction, calculates the address of a branch
destination instruction without using a carry.

Here, the program counter relative value calculating
unit may subtract upper bits of an address of one of the
two specified instructions from upper bits of an address of
the other of the two specified instructions, set a result
of a subtraction as upper bits of the program counter
relative value, and set lower bits of the address of the
other of the two specified instructions as lower bits of the
program counter relative value.

The above construction achieves a linker for
generating programs for a processor which, when executing a
branch instruction, calculates the address of a branch
destination instruction using an absolute value.

The stated primary object can also be achieved by a
disassembler that receives an indication of an address of
an instruction in object code and outputs an assembler name
of the instruction at the indicated address, each address
of an instruction in the object code having upper bits that
indicate a memory address at which a processing packet is
stored and lower bits that indicate a position of a
processing target instruction that is included in the
processing packet, the disassembler including: a program
counter relative value extracting unit for extracting, when
the indicated instruction includes a program counter
relative value, the program counter relative value from the
indicated instruction; a label address calculating unit
for adding an address of the indicated instruction to the
extracted program counter relative value and setting an
addition result as a label address; a storing unit for
storing a label name corresponding to each label address;
and a searching unit for searching the storing unit for a
label name that corresponds to the calculated label address
and outputting the corresponding label name.

The stated construction can disassemble a program
that includes a branch instruction. When the disassembled
instruction is a branch instruction, the address of the
branch destination instruction can be calculated from the
program counter relative value. This address is then used
to search the label table and so obtain the label name. As
a result, the branch destination can be displayed to the
user in the readily understandable form of a label name,
even when program counter relative values are used in
branch instructions.

Here, the label address calculating unit may include
a lower bit calculating unit and an upper bit calculating
unit, the lower bit calculating unit adding lower bits
of the address of the indicated instruction and lower bits
of the program counter relative value, setting a result of
an addition as lower bits of a label address, and sending
any carry generated by the addition to the upper bit
calculating unit, and the upper bit calculating unit adding
upper bits of the address of the indicated instruction,
upper bits of the program counter relative value, and any
carry received from the lower bit calculating unit, and
setting a result of an addition as upper bits of the
label address.



The above construction achieves a disassembler that
can disassemble programs for a processor which, when
executing a branch instruction, calculates an address of a
branch destination instruction using a carry.

Here, the label address calculating unit may include
a lower bit calculating unit and an upper bit calculating
unit, the lower bit calculating unit adding lower bits of
the address of the indicated instruction and lower bits of
the program counter relative value without generating a
carry, and setting a result of an addition as lower bits of
a label address, and the upper bit calculating unit adding
upper bits of the address of the indicated instruction and
upper bits of the program counter relative value, and
setting a result of an addition as upper bits of the label
address.

The above construction achieves a disassembler that
can disassemble programs for a processor which, when
executing a branch instruction, calculates an address of a
branch destination instruction without using a carry.

Here, the label address calculating unit may add
upper bits of the address of the indicated instruction and
upper bits of the program counter relative value, set a
result of an addition as upper bits of the label address,
and set lower bits of the program counter relative value as
lower bits of the label address.

The above construction achieves a disassembler that
can disassemble programs for a processor which, when
executing a branch instruction, calculates an address of a
branch destination instruction using an absolute value.

The stated primary object can also be achieved by a
debugger that receives an indication of an address of an
instruction in object code and replaces the instruction at
the indicated address with a replacement instruction, each
address of an instruction in the object code having upper
bits that indicate a memory address at which a processing
packet is stored and lower bits that indicate a position of
a processing target instruction that is included in the
processing packet, the debugger including: a processing
packet reading unit for reading a processing packet that is
indicated by upper bits of the indicated address from the
memory and writing the processing packet into an
instruction buffer; an instruction writing unit for writing
the replacement instruction into the processing packet in
the instruction buffer over an instruction that is
indicated by the lower bits of the indicated address; and a
processing packet writing unit for writing the processing
packet in the instruction buffer back into the memory after
the replacement instruction has been written.

The above construction reads instructions in units of
processing packets from a memory that stores instructions
in one-byte storage packets, rewrites instructions in an
instruction buffer, and writes instructions back into the
memory in units of processing packets. This achieves a
debugger that can debug instructions whose length is not an
integer number of bytes.
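
The read, rewrite and write-back sequence can be sketched as
follows. The eight-byte packet, the 21-bit instruction unit,
the big-endian packing and the name replace_instruction are
assumptions for illustration only; the upper address bits are
treated as a byte address and a Python integer plays the role
of the instruction buffer.

```python
PACKET_BYTES = 8   # assumed packet size: an integer number of bytes
UNIT_BITS = 21     # assumed instruction width: not a multiple of 8 bits

def replace_instruction(memory: bytearray, pc1: int, pc2: int,
                        replacement: int) -> None:
    """Overwrite one instruction unit in place, a whole packet at a time."""
    # Processing packet reading unit: memory -> instruction buffer.
    buffer = int.from_bytes(memory[pc1:pc1 + PACKET_BYTES], "big")
    # Instruction writing unit: overwrite the unit selected by pc2.
    shift = PACKET_BYTES * 8 - (pc2 + 1) * UNIT_BITS
    unit_mask = (1 << UNIT_BITS) - 1
    buffer = (buffer & ~(unit_mask << shift)) | ((replacement & unit_mask) << shift)
    # Processing packet writing unit: instruction buffer -> memory.
    memory[pc1:pc1 + PACKET_BYTES] = buffer.to_bytes(PACKET_BYTES, "big")
```

A debugger could use such a routine to plant a breakpoint
instruction over a unit that straddles byte boundaries without
disturbing the neighbouring units in the same packet.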

The stated primary object can also be achieved by a
compiler that generates an instruction sequence from source
code, the compiler generating a program counter relative
value calculating instruction that is executed by a
processor, the program counter relative value calculating
instruction being an instruction that performs a
calculation using a first value and a program counter
relative value and uses a result of the calculation to
update the first value, the first value being one of (a) a
value of a program counter stored in a register, and (b)
the value stored in a program counter of the processor,
wherein upper bits of the first value indicate a memory
address at which a processing packet is stored, and lower
bits of the first value of the program counter indicate a
processing target instruction that is included in the
processing packet.

The above construction achieves a compiler that
generates programs for a processor that executes program
counter relative value calculating instructions.

Here, the processor may include a lower bit
calculating unit and an upper bit calculating unit, the
program counter relative value calculating instruction
having the lower bit calculating unit perform a lower bit
calculation and the upper bit calculating unit perform an
upper bit calculation, the lower bit calculation being an
addition using lower bits of the first value and lower bits
of the value of the program counter relative value, where a
result of the lower bit calculation is set as the lower
bits of the first value and any generated carry is sent to
the upper bit calculating unit, and the upper bit
calculation being an addition using upper bits of the first
value, upper bits of the value of the program counter
relative value and any carry received from the lower bit
calculating unit, where a result of the upper bit
calculation is set as the upper bits of the first value.

The above construction achieves a compiler that
generates a program for a processor which, when executing a
program counter relative value calculating instruction,
performs a calculation using a value of the program counter
and the program counter relative value according to a carry
method.

Here, the processor may include a lower bit
calculating unit and an upper bit calculating unit, the
program counter relative value calculating instruction
having the lower bit calculating unit perform a lower bit
calculation and the upper bit calculating unit perform an
upper bit calculation, the lower bit calculation being an
addition using lower bits of the first value and lower bits
of the value of the program counter relative value that
does not generate a carry, where a result of the lower bit
calculation is set as the lower bits of the first value,
and the upper bit calculation being a calculation using
upper bits of the first value and upper bits of the value
of the program counter relative value, where a result of
the upper bit calculation is set as the upper bits of the
first value.

The above construction achieves a compiler that
generates a program for a processor which, when executing a
program counter relative value calculating instruction,
performs a calculation using a value of the program counter
and the program counter relative value without generating a
carry.

Here, the processor may include an upper bit
calculating unit, the program counter relative value
calculating instruction having the upper bit calculating
unit perform an upper bit calculation and setting lower
bits of the program counter relative value as lower bits of
the first value, and the upper bit calculation being an
addition using upper bits of the first value and upper bits
of the value of the program counter relative value, where a
result of the upper bit calculation is set as the upper
bits of the first value.

The above construction achieves a compiler that
generates a program for a processor which, when executing a
program counter relative value calculating instruction,
performs a calculation using a value of the program counter
and the program counter relative value according to an
absolute value calculating method.
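A minimal sketch of the absolute value calculating method described
above is given below for illustration. It assumes the 29-bit upper
and 3-bit lower split of the first embodiment. The function name and
the wrap-around of the upper bits at 29 bits are assumptions of the
example rather than statements from the specification.

```python
UPPER_BITS = 29   # packet address width in the first embodiment
LOWER_BITS = 3    # in-packet address width in the first embodiment
UPPER_MASK = (1 << UPPER_BITS) - 1
LOWER_MASK = (1 << LOWER_BITS) - 1


def pc_relative_absolute(first_value: int, relative: int) -> int:
    """Update first_value using the absolute value calculating method."""
    # Upper bit calculation: plain addition of the two upper parts
    # (wrapping at 29 bits is an assumption of this sketch).
    upper = ((first_value >> LOWER_BITS) + (relative >> LOWER_BITS)) & UPPER_MASK
    # The lower bits are not calculated: the lower bits of the program
    # counter relative value are simply set as the new lower bits.
    lower = relative & LOWER_MASK
    return (upper << LOWER_BITS) | lower
```

In this method the lower bits of the first value play no part in the
result, so no carry ever passes between the lower and upper parts.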

BRIEF DESCRIPTION OF THE DRAWINGS

These and other objects, advantages and features of
the invention will become apparent from the following
description thereof taken in conjunction with the
accompanying drawings which illustrate a specific
embodiment of the invention. In the drawings:

Fig. 1 is a block diagram showing the construction of
a conventional processor;

Fig. 2A shows the format of one instruction executed
by the processor of the first embodiment of the present
invention;

Fig. 2B shows the format of another instruction
executed by the processor of the first embodiment of the
present invention;

Fig. 2C shows the format of another instruction
executed by the processor of the first embodiment of the
present invention;

Fig. 2D shows the format of another instruction
executed by the processor of the first embodiment of the
present invention;

Fig. 2E shows the format of another instruction
executed by the processor of the first embodiment of the
present invention;

Fig. 3A shows an instruction packet that is the unit
used for storing and reading instructions in this first
embodiment;

Fig. 3B shows the read order of instructions;

Fig. 3C shows the execution order of instructions;

Fig. 4 shows an example of the methods used by a
conventional processor to store and read instructions that
are not byte-aligned;

Fig. 5 shows the procedure by which the object code
to be executed by the processor is generated by a compiler,
optimization apparatus, assembler, and linker;

Fig. 6 is a block diagram showing the details of the
processor 309 and the external memory;

Fig. 7 is an increment table showing the rules used
to increment the in-packet address;

Fig. 8A is an addition table showing the addition
rules used when adding the lower 3 bits of the address of a
branch instruction to lower 3 bits of the PC relative
value;

Fig. 8B is a subtraction table showing the
subtraction rules used when subtracting the lower 3 bits of
the PC relative value from the lower 3 bits of a branch
destination address;

Fig. 9 is a block diagram showing the components and
input/output data of the optimization apparatus 303;

Fig. 10 is a flowchart showing the operation
procedure of the optimization apparatus;

Fig. 11 shows part of the optimization processing
code 903 generated by the code optimization apparatus 902;

Fig. 12 shows the address assigned codes 916
generated from the optimization processing code 903 shown
in Fig. 11;

Fig. 13 shows the label information 906 generated
from the address assigned codes 916 shown in Fig. 12;

Fig. 14 shows the optimized code 304 generated from
the address assigned codes 916 shown in Fig. 12;

Fig. 15 is a block diagram that shows the
construction of the assembler 305 shown in Fig. 5 and the
input/output data related to the assembler 305;

Fig. 16 is a flowchart showing the operation of the
assembler;

Fig. 17 shows the machine language codes 803 that are
generated from the optimized code 304 shown in Fig. 14;

Fig. 18 shows the label information that is generated
from the machine language codes shown in Fig. 17;

Fig. 19 shows the relocatable codes that are
generated from the machine language codes 803 shown in Fig.
17;

Fig. 20 is a block diagram showing the construction
of the linker 307 and the I/O (input/output) data of the
linker 307;

Fig. 21 is a flowchart showing the operation of the
linker 307;

Fig. 22 shows the relocatable codes;

Fig. 23 shows the state when the relocatable codes
814 shown in Fig. 19 have been combined with the
relocatable code shown in Fig. 22;

Fig. 24 shows the resulting combined codes 703;

Fig. 25 shows the label information that is generated
from the combined codes 703 shown in Fig. 24;

Fig. 26 shows the object codes generated from the
combined codes 703 shown in Fig. 24;

Fig. 27 shows the object code generated by the second
embodiment of the present invention;

Fig. 28A shows the construction of an instruction
packet in the third embodiment;

Fig. 28B shows the types of instructions used in the
third embodiment;

Fig. 28C shows the relation between in-packet
addresses and the instruction units in a packet;

Fig. 29A is an addition table showing the addition
rules for adding the lower 3 bits of the address of the
branch instruction and the lower 3 bits of the PC relative
value in the calculation method of the fourth embodiment
that does not use a carry;

Fig. 29B is a subtraction table showing the
subtraction rules for subtracting the lower 3 bits of the
address of the branch instruction from the lower 3 bits of
the address of the branch destination instruction in the
calculation method of the fourth embodiment that does not
use a carry;

Fig. 30 shows the object code that is generated by
the address calculation method of the fourth embodiment
that does not use a carry;

Fig. 31A is an addition table showing the addition
rules for adding the lower 3 bits of the address of the
branch instruction and the lower 3 bits of the PC relative
value in the calculation method of the fifth embodiment
that uses absolute values;

Fig. 31B is a subtraction table showing the
subtraction rules for subtracting the lower 3 bits of the
address of the branch instruction from the lower 3 bits of
the address of the branch destination instruction in the
calculation method of the fifth embodiment that uses
absolute values;

Fig. 32 shows the object code that is generated by
the above address calculation method of the fifth
embodiment that uses absolute values;

Fig. 33 shows the object code that has been generated
using the linear calculation method of the sixth
embodiment;

Fig. 34 shows the processor of the seventh
embodiment;

Fig. 35A shows the operation that corresponds to a PC
adding instruction which is shown in mnemonic form;

Fig. 35B shows the operation that corresponds to a PC
subtracting instruction which is shown in mnemonic form;

Fig. 36 shows the construction of the compiler of the
eighth embodiment of the present invention;

Fig. 37 is a flowchart showing the operation of the
compiler;

Fig. 38 shows source code which is written in C
language;

Fig. 39 shows the intermediate codes that have been
generated from the source program shown in Fig. 38;

Fig. 40 shows the assembler code that has been
produced by converting the intermediate codes shown in Fig.
39;

Fig. 41 is a block diagram showing the construction
of the debugger and disassembler of the present embodiment;

Fig. 42 is a flowchart showing the operating
procedure of a disassembler of the present invention; and

Fig. 43 is a flowchart showing the operation of the
debugger of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The following is a detailed description of several
embodiments of the present invention, with reference to the
accompanying drawings.

First Embodiment

This first embodiment relates to an optimization
apparatus, an assembler, and a linker that generate
programs where read operations and execute operations have
different units, and to a processor for executing such
programs.

Formats of the Instructions Executed by the Processor



The following explains the formats of the
instructions executed by the processor of this first
embodiment. These formats are shown in Figs. 2A to 2E. The
instructions executed by the present processor are
constructed so that 21 bits is set as one instruction unit.
For the present processor, there are both one-unit (i.e.,
21-bit) instructions and two-unit (i.e., 42-bit)
instructions.

The format information 101 is written as one bit and
shows the length of each instruction. When the format
information 101 is "0", this shows that the unit including
this format information 101 forms one complete instruction,
which is to say, a 21-bit instruction. When the format
information 101 is "1", this shows that the unit including
this format information 101 and the following unit together
form one two-unit instruction, which is to say, a 42-bit
instruction.
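The rule above can be illustrated with a short sketch that groups a
run of units into instructions. The function name and the
representation of the input as a list holding one format bit per unit
are assumptions made for the example. The position of the format bit
inside a unit is not modelled.

```python
def instruction_lengths(format_bits):
    """Return the bit length of each instruction in a run of units.

    format_bits holds one entry per 21-bit unit. The entry for the
    second unit of a two-unit instruction is skipped, since that unit
    only continues the instruction started by the preceding unit.
    """
    lengths = []
    i = 0
    while i < len(format_bits):
        if format_bits[i] == 0:
            lengths.append(21)   # the unit is one complete instruction
            i += 1
        else:
            lengths.append(42)   # this unit and the next form one instruction
            i += 2
    return lengths
```

For example, four units whose first, third and fourth entries start
instructions, with the second unit marked "1", form a 21-bit, a
42-bit and a 21-bit instruction.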

The parallel execution boundary information 100 is
also written as one bit and shows whether a parallel
execution boundary exists between the instruction formed by
the present unit and the following instruction. When the
parallel execution boundary information 100 is "1", this
shows that a parallel execution boundary exists between the
instruction including this parallel execution boundary
information 100 and the following instruction, so that
these instructions will be executed in different cycles.
When the parallel execution boundary information 100 is
"0", this shows that no parallel execution boundary exists
between the instruction including this parallel execution
boundary information 100 and the following instruction, so
that these instructions will be executed in the same cycle.

The remaining bits in each instruction are used to
show an operation. This means that 19 bits can be used to
indicate the operation in a 21-bit instruction and that 40
bits can be used to indicate the operation in a 42-bit
instruction. The fields marked "Op1", "Op2", "Op3", and
"Op4" are used to store opcodes that indicate the type of
operation to be performed. The field marked "Rs" is used
to store the register number of a register used as the
source operand and the field marked "Rd" is used to store
the register number of a register used as the destination
operand. The fields marked "imm5" and "imm32" are
respectively used to store 5-bit and 32-bit immediates that
are used in calculations. Finally, the fields marked
"disp13" and "disp32" are respectively used to store 13-bit
and 32-bit displacements.

Transfer instructions and arithmetic instructions
that handle long (such as 32-bit) constants and branch
instructions that use large displacements are defined as
42-bit instructions. Most other instructions are defined
as 21-bit instructions. Of the two units used to compose a
42-bit instruction, the latter unit is only used to store
part of the long constant or displacement, and so does not
store the opcode of the instruction.

Reading and Execution of Instructions by the Processor



The following explains the operation of the present
processor when reading and executing instructions. Note
that the processor of the present embodiment presumes
that static parallel scheduling is used. Fig. 3A shows an
instruction packet that is the unit used for storing and
reading instructions. Each instruction packet is composed
of three instruction units (63 bits) and dummy data (1
bit). In each cycle, the processor reads instructions
using this fixed 64-bit packet length. Packets of this
size are used because the 21-bit instruction unit size
is not suited to reading from memory. Accordingly, a
number of such instructions are read together with dummy
data to make the total packet size equal to an integer
number of bytes. In this example, since the number of
instruction units in each instruction packet is not a power
of two, there is the following special effect. This effect
overcomes the problems that occur when positions of the
units inside instruction packets are expressed using
binary. In the following explanation, the three units in
an instruction packet are called the first, second and
third units in order starting from the unit with the lowest
address value.

Fig. 3B shows the read order of instructions. As
shown in the figure, one instruction packet is read in each
cycle.

Fig. 3C shows the execution order of instructions. In
each cycle, instructions are executed as far as the next
parallel execution boundary. This means that the
instructions are executed up to and including an
instruction whose parallel execution boundary information
100 is "1". Instruction units that are read but not
executed are accumulated in the instruction buffer, and are
executed in a later cycle.
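The grouping of instructions into execution cycles can be sketched
as follows, purely as an illustration. The function name and the
representation of a program as a list of parallel execution boundary
bits, one per instruction, are assumptions of the example. The
sketch does not model fetching, the instruction buffer, or the limit
of three instructions per cycle, which static scheduling is presumed
to respect.

```python
def split_into_cycles(boundary_bits):
    """Group instruction indices by the cycle in which they execute.

    boundary_bits[i] is the parallel execution boundary information
    of instruction i: 1 means a boundary follows the instruction.
    """
    cycles = []
    current = []
    for index, bit in enumerate(boundary_bits):
        current.append(index)
        if bit == 1:
            # Execution in this cycle stops after this instruction.
            cycles.append(current)
            current = []
    if current:
        # Trailing instructions with no closing boundary stay together.
        cycles.append(current)
    return cycles
```

With boundary bits 0, 1, 1, 0, 0, 1, the first two instructions
execute together, the third executes alone, and the last three
execute together.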

As described above, the processor of the present
embodiment reads instructions using packets of a fixed
length, but only executes a suitable number of units in
each cycle depending on parallelism of the instructions.
The reason that the present processor can start the
execution of instructions in one cycle at any of the
instruction units in an instruction packet is that an
in-packet address specifies an instruction unit in an
instruction packet. This is described in more detail
later.

Fig. 4 shows an example of the methods used by a
conventional processor to store and read instructions that
are not byte-aligned. When 21-bit instructions that are
not byte-aligned are to be read in byte-units, three unused
bits have to be added to the end of each instruction to
make the instruction length 24 bits. This means that what
are essentially 21-bit instructions are stored into and
read from memory in 24-bit units. The length of three of
such instructions is 72 bits, so that the storage of three
instructions in a 64-bit packet in the present embodiment
reduces overall program size.

Note that while the present embodiment describes the
packet construction when 21-bit instructions are used, the
invention is not limited to this instruction length. It is
equally possible to construct instruction packets of
instructions of a different length and to read the
instructions using such instruction packets. As one
example, when instructions are n bits long, values of m and
r may be selected so as to give a maximum value of
n*m/(n*m+r) subject to (n*m+r) mod 8 = 0. One packet is then
composed of m instruction units (each being n bits long)
and r-bit dummy data. By doing so, instruction packets can
be composed of multiple-byte size using relatively little
dummy data.
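The selection of m and r can be illustrated by a small search. This
is a sketch only: it assumes that the quantity being maximized is the
fraction of the packet occupied by instruction bits, n*m/(n*m+r), and
that the number of units per packet is capped by a limit chosen by
the designer. The function name and the max_units parameter are
hypothetical.

```python
from fractions import Fraction


def best_packet(n, max_units):
    """Pick (m, r) giving the largest share of instruction bits.

    n is the unit length in bits. For each candidate m, r is the
    smallest dummy length that makes n*m + r a whole number of bytes.
    Ties are resolved in favour of the smaller m.
    """
    best = None
    for m in range(1, max_units + 1):
        r = (-n * m) % 8                      # pad up to a byte boundary
        share = Fraction(n * m, n * m + r)    # exact, avoids float ties
        if best is None or share > best[0]:
            best = (share, m, r)
    return best[1], best[2]
```

With n = 21 and at most four units per packet, the search selects
m = 3 and r = 1, which corresponds to the 64-bit packet of the
present embodiment.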

Method for Expressing Instruction Addresses

The following explains the method used to express
instruction addresses in the present embodiment. Here, an
instruction address refers to the address used to specify
the position of a unit and is expressed as 32 bits.

The upper 29 bits of a 32-bit address are used to
specify an instruction packet and so are called the "packet
address". This packet address is expressed as a 29-bit
hexadecimal figure in a format such as "29'h01234567". A
value produced by shifting the value of this packet address
by 3 bits to the left is the memory address at which the
instruction packet is stored.

The lower 3 bits in a 32-bit address are used to
specify an instruction unit included in the instruction
packet and so are called the "in-packet address". This
in-packet address is expressed as a 3-bit binary value in a
format such as "3'b001". As examples, the in-packet
address "3'b001" specifies the first unit in an instruction
packet, the in-packet address "3'b010" specifies the second
unit and the in-packet address "3'b100" specifies the
third unit. However, the in-packet addresses are not
limited to these specific values. Other values may be used
provided that the instruction units in an instruction
packet are each specified using their own value.
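The division of a 32-bit instruction address into its two parts can
be sketched as follows. The function names are hypothetical. Only
the 29-bit and 3-bit split and the 3-bit left shift come from the
description above.

```python
def split_address(address):
    """Split a 32-bit instruction address into its two fields."""
    packet_address = address >> 3        # upper 29 bits
    in_packet_address = address & 0b111  # lower 3 bits
    return packet_address, in_packet_address


def packet_memory_address(packet_address):
    """Memory address at which the instruction packet is stored."""
    return packet_address << 3
```

For the packet address 29'h01234567, the packet is stored at memory
address 0x091A2B38, whichever unit within the packet is selected.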

The indicating of addresses in this embodiment is
such that only 3 bits are assigned for eight bytes of
instructions. This gives the same results as when a
conventional processor assigns a separate address to each
byte, since the upper 29 bits of addresses assigned to
eight bytes of instructions will be the same.

Method for Generating the Object Code Executed by the
Processor

The following explains the method for generating the
object code that is executed by the processor of the
present embodiment.

First, the terminology to be used in this explanation
is defined.

A "PC relative value" is the difference between the
addresses of two instructions.

A "label" is either an "instruction address-resolved
label" or a "PC relative value-resolved label". Instruction
address-resolved labels are replaced with absolute
addresses of instructions during the processing that
converts a program into object code. An example of such a
label is the label "L2" in the transfer instruction "mov
L2,r1" that transfers an instruction stored in memory to
the register r1. PC relative value-resolved labels are
replaced with PC relative values during the processing that
converts a program into object code. An example of such a
label is the label "L1" in the unconditional branch
instruction "bra L1" that performs an unconditional branch
using the PC relative value. "Local labels" and "external
labels" also exist as other types of label. When a label
and the instruction including the label are included in the
same module (a module being a subprogram composed of an
instruction sequence achieving one processing function),
such label is called a local label, while when the label
and instruction including the label are included in
different modules, such label is called an external label.

Fig. 5 shows the procedure by which the object code
to be executed by the processor is generated by a compiler,
optimization apparatus, assembler, and linker. An overview
of the functions of these components is given below.

The compiler 301 analyzes the content of the source
code 300 that is written in a high-level language like C
and outputs assembler code 302.

The optimization apparatus 303 assigns temporary
addresses to the assembler code 302, links the instruction
sequences in groups of three instruction units, and outputs
optimized code 304 as the linked results. In this process,
local labels are calculated as PC relative values or
instruction addresses. The instruction size, which is to
say, whether an instruction should be expressed as a
one-unit instruction or as a two-unit instruction, is then
determined based on the value of the PC relative value or
the instruction address.

The assembler 305 outputs relocatable codes 306 which
it generates from the optimized code 304. This processing
converts local labels that should be resolved with PC
relative values into PC relative values.

The linker 307 combines a plurality of modules. That
is, the linker 307 combines a plurality of relocatable
codes 306 and outputs the resulting object code 308. In
this processing, unresolved labels are converted into PC
relative values or instruction addresses.

The processor 309 executes the object code 308.

As described above, a program written in a high-level
language is converted by the compiler 301, the optimization
apparatus 303, the assembler 305, and the linker 307 into
object code that is in a format executable by the
processor. Each label in the program is converted into a
PC relative value or an instruction address by one of the
steps in the above procedure. Address resolution for local
labels that should be resolved by a PC relative value is
performed by the assembler 305. Address resolution for
local labels that should be resolved by an instruction
address and address resolution for external labels are
performed by the linker 307.

The following describes the construction and
operation of the processor 309, the linker 307, the
assembler 305, and the optimization apparatus 303 shown in
Fig. 5.

Processor



Fig. 6 is a block diagram showing the details of the
processor 309 and the external memory.

The processor 309 is capable of executing a maximum
of three instructions in parallel. This processor 309
includes calculators 401a to 401c, general registers 402, an
upper PC 403, a lower PC 404, an upper PC calculator 411, a
lower PC calculator 405, an INC 412, an instruction buffer
408, a prefetch upper counter 410, a prefetch lower
counter 413, instruction decoders 409a to 409c, a PC relative
value selector 420, an immediate selector 421, an operand
data buffer 423, and an operand address buffer 422. The
external memory includes the data memory 406 and the
instruction memory 407.

In the following explanation, the upper PC 403 and
the lower PC 404 will be collectively referred to as the
"PC", and the upper PC calculator 411 and the lower PC
calculator 405 will be collectively referred to as the "PC
calculator".

The first calculator 401a, the second calculator
401b, and the third calculator 401c each perform one
calculation. These calculators are capable of calculating
at the same time.

The general registers 402 store data, addresses and
other data.

The upper PC 403 stores the upper 29 bits of the
address of the first instruction in a set of instructions
to be executed in the next cycle, which is to say, a packet
address.

The lower PC 404 stores the lower 3 bits of the
address of the first instruction in a set of instructions
to be executed in the next cycle, which is to say, an
in-packet address.

The instruction memory 407 stores instructions that
are expressed by the object code 308.

The instruction buffer 408 stores instructions that
have been read from the instruction memory 407.

The first instruction decoder 409a, the second
instruction decoder 409b, and the third instruction decoder
409c decode instructions and, if the respective
instructions are executable, give indications to other
components in the processor to have the instructions
executed. The first instruction decoder 409a receives an
input of the first instruction stored in the instruction
buffer 408, the second instruction decoder 409b an input of
the next instruction, and the third instruction decoder
409c an input of the instruction after that. These
instruction decoders 409a to 409c investigate whether there
is a parallel execution boundary between the instruction
units and only have the instructions that should be
executed in the present cycle executed. As one example,
when an instruction performs a calculation using a
constant, the constant is sent to the first calculator 401a
via the immediate selector 421 and the first calculator
401a is instructed to perform the calculation. For a branch
instruction, a PC relative value is sent via the PC
relative value selector 420 to the lower PC calculator 405
and upper PC calculator 411 that are then instructed to
update the PC. The instruction decoders 409a to 409c send
control signals showing the number of executed instruction
units to have the INC 412 update the PC increment, and send
control signals showing the number of executed instruction
units to the instruction buffer 408 to have the executed
instruction units deleted from the instruction buffer 408.

The PC relative value selector 420 outputs the PC
relative value outputted by the instruction decoders 409a to
409c to the lower PC calculator 405 and the upper PC
calculator 411.

The immediate selector 421 outputs an immediate
outputted by the instruction decoders 409a to 409c to the
general registers 402 and the calculators 401a to 401c.

The INC 412 receives information regarding the number
of executed instruction units via control signals sent by
the instruction decoders 409a to 409c, and increments the
value of the upper PC 403 and the lower PC 404 in
accordance with this number. By doing so, the INC 412 sets
the packet address of the first instruction in the set of
instructions to be executed in the next cycle in the upper
PC 403 and the in-packet address of the first instruction
in the set of instructions to be executed in the next cycle
in the lower PC 404.

The upper PC calculator 411 and lower PC calculator
405 respectively update the upper PC 403 and the lower PC
404. When a branch instruction is decoded by the
instruction decoders 409a to 409c, the upper PC calculator
411 and lower PC calculator 405 respectively receive the
upper 29 bits and the lower 3 bits of the PC relative value
included in the branch instruction from the PC relative
value selector 420. The lower PC calculator 405 increases
or decreases the present value of the lower PC 404 by the
lower 3 bits in the PC relative value and sends the
calculation result to the lower PC 404 as the new lower PC.
The upper PC calculator 411 increases or decreases the
present value of the upper PC 403 by the upper 29 bits in
the PC relative value and sends the calculation result to
the upper PC 403 as the new upper PC. This operation of the
PC calculators is described later in this specification. As
described above, when a branch instruction is executed, the
packet address of the branch destination instruction that
is to be executed next is set in the upper PC 403 and the
in-packet address is set in the lower PC 404. There are
also cases where the upper PC calculator 411 and lower PC
calculator 405 update the PC by calculating an address
using a PC relative value and an address stored in the
general registers 402.

The prefetch upper counter 410 shows the upper 29
bits of the address of the first instruction in the set of
instructions to be read from the instruction memory 407,
which is to say, the packet address. The prefetch upper
counter 410 normally increments this value by one in each
cycle. When a branch instruction was executed in the
previous cycle, the packet address of the branch
destination instruction set in the upper PC 403 is sent to
the prefetch upper counter 410 where it is set in place of
the present value in the prefetch upper counter 410.

The prefetch lower counter 413 shows the lower 3 bits
of the address of the first instruction in the set of
instructions read from the instruction memory 407, which is
to say, the in-packet address. In this embodiment, the
value "3'b000" is set in the prefetch lower counter 413.
As a result, the instructions to be read are indicated in
packet units, so that one packet is sent from the
instruction memory 407 to the instruction buffer 408 in
each cycle.

The data memory 406 stores operand data.

Th^ operand data buffer 423 and operand address 
buffer 4^2 are buffers that are located between the data 
memory 406 and the processor. 

The following explains the incrementing method and
calculating method for instruction addresses. This is the
most characteristic feature of the present embodiment.

Incrementing Method for Instruction Addresses

The incrementing of addresses is performed by adding
an increment value to the in-packet address of an
instruction, and adding any carry produced by the addition
to the packet address.

Fig. 7 is an increment table showing the rules used
to increment the in-packet address. As shown in the
figure, when the in-packet address is "3'b000" or "3'b010",
the incrementing of the instruction address is performed by
adding 2 to the in-packet address. When the in-packet
address is "3'b100", a carry to the packet address is
produced (which is to say, 1 is to be added to the upper 29
bits of the instruction address) and the in-packet address
is updated to "3'b000". This means that the incrementing
of the in-packet address is a calculation that cycles
through the three values "3'b000", "3'b010", and "3'b100".
As one example, when the increment value is "2" and the
value of the in-packet address before incrementing is
"3'b100", the in-packet address after incrementing is
"3'b010" and a carry of "1" to the packet address is
generated.

Note that in the present embodiment, the in-packet
address does not need to be expressed in binary. This is
especially effective when the number of instruction units
in an instruction packet is not a power of 2. When this is
the case, it is not possible to express the position of an
instruction unit in an instruction packet in binary and use
a binary calculation to shift the position of the
instruction unit. However, in the present embodiment, the
position of an instruction unit in an instruction packet is
expressed using m different values. By using a calculation
that cycles through these m values, the specifying of
instruction units and the calculations for shifting the
instruction position can be achieved even if the number of
instruction units in an instruction packet is not a power
of 2.
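
The incrementing rule described above can be sketched in a few lines of Python. This is only an illustration and not the circuit of the embodiment: the names IN_PACKET_VALUES and increment are hypothetical, and the sketch assumes that the increment value counts instruction units, which is how the example with an increment value of "2" reads.

```python
# Hypothetical sketch of the increment rule; names are illustrative only.
IN_PACKET_VALUES = (0b000, 0b010, 0b100)  # the m = 3 unit positions in a packet


def increment(packet_addr, in_packet_addr, units=1):
    """Advance an instruction address by `units` instruction units.

    Returns (new_packet_addr, new_in_packet_addr). The packet address is
    kept to 29 bits, matching the width of the upper PC in the text.
    """
    position = IN_PACKET_VALUES.index(in_packet_addr)  # 0, 1 or 2
    # Cycle through the m values; whole cycles become the carry.
    carry, position = divmod(position + units, len(IN_PACKET_VALUES))
    return (packet_addr + carry) & 0x1FFFFFFF, IN_PACKET_VALUES[position]
```

Under these assumptions, increment(0x100, 0b100, 2) gives the packet address 0x101 and the in-packet address 3'b010, which agrees with the example in the text. A packet with a different number of units would only need a different IN_PACKET_VALUES tuple.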
Method for Calculating the Instruction Address

The following explains the carry method which is one
of the methods used for calculating the instruction
addresses in the present invention. Other methods used to
calculate addresses are a separation method, an absolute
position indicating method, and a linear addressing method,
though these will be described later in this specification.
In the carry method, the upper 29 bits and lower 3 bits of
an instruction address are calculated separately. However,
when calculating the upper bits, any carry to or from the
upper 29 bits that occurred when calculating the lower 3
bits is taken into account.

The following explains the method by which the
present processor adds the address of a branch instruction
and a PC relative value to find a branch destination
address. The lower PC calculator 405 shown in Fig. 6 adds
the lower 3 bits of the address of a branch instruction to
the lower 3 bits of the PC relative value. Fig. 8A is an
addition table showing the addition rules used when adding
the lower 3 bits of the address of a branch instruction to
the lower 3 bits of the PC relative value. As shown in Fig.
8A, this addition of the lower 3-bit values differs from a
binary calculation in being a calculation that cycles
through the three values "3'b000", "3'b010", and "3'b100".
When a carry occurs as shown in Fig. 8A, the lower PC
calculator 405 sends the carry to the upper PC value to the
upper PC calculator 411.

The upper PC calculator 411 shown in Fig. 6 adds the
upper 29 bits of the address of a branch instruction to the
upper 29 bits of the PC relative value. When doing so, if
the calculation of the lower PC calculator 405 has resulted
in a carry to the upper PC, the upper PC calculator 411
also adds this carry. This addition is a normal addition
of binary values.

The addition results of the lower PC calculator 405
and upper PC calculator 411 form the address of the branch
destination instruction. The addition result for the lower
3 bits is set in the lower PC 404 and the addition result
for the upper 29 bits is set in the upper PC 403.
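
The addition performed by the two PC calculators can be sketched as follows. This is a minimal illustration under stated assumptions rather than the hardware of Fig. 6: the function names split and branch_destination are hypothetical, and the cyclic rule of Fig. 8A is modelled by treating the values 3'b000, 3'b010, and 3'b100 as unit positions 0, 1, and 2.

```python
# Hypothetical sketch of the carry-method addition; names are illustrative.
UNITS_PER_PACKET = 3          # three instruction units per packet
UPPER_MASK = (1 << 29) - 1    # the packet address is 29 bits wide


def split(address):
    """Split a 32-bit address into (upper 29 bits, lower 3 bits)."""
    return (address >> 3) & UPPER_MASK, address & 0b111


def branch_destination(branch_address, pc_relative):
    """Add a PC relative value to a branch address using the carry method."""
    branch_upper, branch_lower = split(branch_address)
    rel_upper, rel_lower = split(pc_relative)
    # Lower 3 bits: cyclic addition over 3'b000, 3'b010, 3'b100.
    # Dividing by 2 turns those values into unit positions 0, 1, 2.
    carry, position = divmod(branch_lower // 2 + rel_lower // 2,
                             UNITS_PER_PACKET)
    # Upper 29 bits: ordinary binary addition plus the carry from below.
    upper = (branch_upper + rel_upper + carry) & UPPER_MASK
    return (upper << 3) | (position * 2)
```

For instance, adding the PC relative value 0x02 to a branch at 0x14 (packet 2, in-packet 3'b100) wraps the in-packet address to 3'b000 and carries into the packet address, so the sketch returns 0x18.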

The following explains the calculations of the
optimization apparatus 303, assembler 305, and linker 307
for finding the PC relative value, which is to say the
subtraction of the branch instruction address from the
branch destination address. Like the addition described
above, this subtraction is performed separately for the
upper 29 bits and lower 3 bits. The lower address
subtraction means 907 of the optimization apparatus 303,
the lower address subtraction means 806 of the assembler
305, and the lower address subtraction means 706 of the
linker 307 subtract the lower 3 bits of the branch
instruction address from the lower 3 bits of the branch
destination address. Fig. 8B is a subtraction table
showing the subtraction rules used when subtracting the
lower 3 bits of the PC relative value from the lower 3 bits
of a branch destination address. As shown in Fig. 8B, this
subtraction of the lower 3-bit values differs from a binary
calculation in being a calculation that cycles through the
three values "3'b000", "3'b010", and "3'b100". When a
carry occurs as shown in Fig. 8B, the lower address
subtraction means that performs the calculation (such as
lower address subtraction means 907) sends the carry from
the upper PC value to the corresponding upper address
subtraction means (such as upper address subtraction means
910). The various upper address subtraction means are
described in more detail later.

The upper address subtraction means 910 in the
optimization apparatus 303, the upper address subtraction
means 809 in the assembler 305, and upper address
subtraction means 709 in the linker 307 subtract the upper
29 bits of the address of a branch instruction from the
upper 29 bits of the address of the branch destination
instruction. When doing so, if the calculation of the
lower address subtraction means 907 (or similar) has
resulted in a carry from the upper PC, the upper address
subtraction means 910 (or similar) also subtracts this
carry. This subtraction is a normal subtraction of binary
values.

These subtraction results respectively form the lower
3 bits and the higher 29 bits of the PC relative value.
This method is also used when the processor finds the
address of a branch destination instruction by executing a
subtraction on the address of a branch instruction and a PC
relative value.
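
The subtraction can be sketched with an explicit lookup table in the spirit of Fig. 8B. The table below is a reconstruction from the cycling rule stated in the text, not a copy of the figure, and the names SUBTRACTION_TABLE and pc_relative are hypothetical.

```python
# Hypothetical sketch of the carry-method subtraction; names are illustrative.
UPPER_MASK = (1 << 29) - 1            # the upper part is 29 bits wide
VALUES = (0b000, 0b010, 0b100)        # valid lower 3-bit values

# (minuend, subtrahend) -> (lower result, carry from the upper bits).
# Reconstructed from the cycling rule in the text, not taken from Fig. 8B.
SUBTRACTION_TABLE = {
    (a, b): (VALUES[(i - j) % 3], 1 if i < j else 0)
    for i, a in enumerate(VALUES)
    for j, b in enumerate(VALUES)
}


def pc_relative(destination, branch):
    """Return the 32-bit PC relative value destination - branch."""
    lower, carry = SUBTRACTION_TABLE[(destination & 0b111, branch & 0b111)]
    # Upper 29 bits: ordinary binary subtraction, also subtracting the carry.
    upper = ((destination >> 3) - (branch >> 3) - carry) & UPPER_MASK
    return (upper << 3) | lower
```

With this table, subtracting 3'b010 from 3'b000 yields 3'b100 together with a carry of 1, so the lower bits never take a value outside the three valid positions.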

The optimization apparatus 303, assembler 305, and
linker 307, which calculate a PC relative value from the
difference between the address of a branch destination
instruction and the address of a branch instruction, and
the processor 309, which calculates the address of a branch
destination instruction using this PC relative value,
calculate addresses using the same carry method. As a
result, when executing a branch instruction, the processor
can correctly calculate the address of a branch destination
instruction from the PC relative value. This address
calculation method that uses a carry has a feature in that
it can calculate addresses by performing separate
calculations for upper bits and lower bits while
maintaining the continuity between the two.
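
The consistency between the tool side and the processor side can be checked with a small round-trip test. The sketch below is an illustration only: the function names subtract, add, and round_trip_ok are hypothetical, and the check covers just a handful of packet addresses rather than the full address space.

```python
# Hypothetical round-trip check for the carry method; names are illustrative.
from itertools import product

UPPER_MASK = (1 << 29) - 1            # 29-bit packet address
VALID_LOWER = (0b000, 0b010, 0b100)   # in-packet addresses of the three units


def subtract(destination, branch):
    """Tool side: PC relative value = destination - branch (carry method)."""
    positions = (destination & 7) // 2 - (branch & 7) // 2
    carry = 1 if positions < 0 else 0
    upper = ((destination >> 3) - (branch >> 3) - carry) & UPPER_MASK
    return (upper << 3) | (positions % 3) * 2


def add(branch, pc_relative):
    """Processor side: destination = branch + PC relative (carry method)."""
    carry, position = divmod((branch & 7) // 2 + (pc_relative & 7) // 2, 3)
    upper = ((branch >> 3) + (pc_relative >> 3) + carry) & UPPER_MASK
    return (upper << 3) | position * 2


def round_trip_ok(packets=range(4)):
    """Check add(branch, subtract(dest, branch)) == dest for a few packets."""
    addresses = [(p << 3) | low for p, low in product(packets, VALID_LOWER)]
    return all(
        subtract(d, b) & 7 in VALID_LOWER and add(b, subtract(d, b)) == d
        for d, b in product(addresses, repeat=2)
    )
```

By contrast, an ordinary 32-bit binary subtraction such as 0x800 - 0x812 leaves 3'b110 in the lower 3 bits, which is not one of the three valid in-packet addresses. This is one way to see why the tools and the processor need to share the same method.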



Optimization Apparatus 

Fig. 9 is a block diagram showing the components and
input/output data of the optimization apparatus 303 shown
in Fig. This optimization apparatus 303 optimizes the
assembler code 302 generated by the compiler 301, links the
instruction sequences together in packets of three
instruction units, and outputs the resulting optimized code
304. The optimization apparatus 303 includes a code
optimization apparatus 902, an address assigning means 904,
a label detecting means 905, a lower address subtraction
means 907, an upper address subtraction means 910, an
address difference calculating means 912, and a label
information resolving means 914.

The code optimization apparatus 902 optimizes the 
assembler code 302 and so generates the optimization 
processing code 903. This processing of the code 
optimization apparatus 902 is the same as any well-known 
optimization apparatus, and so will not be described. 

The address assigning means 904 estimates an address
for each instruction in the optimization processing code
903 produced by the code optimization apparatus 902 and
assigns an estimated address to each instruction. These
addresses are called provisional addresses in this
specification. As a result, the address assigning means
904 outputs the address assigned codes 916.

The label detecting means 905 detects local labels
from the address assigned codes 916. On detecting a label
that should be resolved by an instruction address, the
label detecting means 905 obtains the provisional address
of the instruction including this label. Conversely, on
detecting a label that should be resolved by a PC relative
value, the label detecting means 905 obtains the
provisional addresses of the instruction including this
label and the branch destination instruction. After this,
the label detecting means 905 outputs the label information
906 that shows the instructions that include labels and
information on values for resolving these labels.

The lower address subtraction means 907, the upper
address subtraction means 910, and the address difference
calculating means 912 calculate the PC relative values for
labels, in the label information 906, that should be
resolved by PC relative values.

The lower address subtraction means 907 subtracts the
lower 3 bits of the provisional address of a branch
instruction from the lower 3 bits of the provisional
address of the branch destination instruction and outputs
the resulting carry value 908 and lower subtraction result
909.

The upper address subtraction means 910 subtracts the
upper 29 bits of the provisional address of a branch
instruction and the carry value 908 calculated by the lower
address subtraction means 907 from the upper 29 bits of the
provisional address of the branch destination instruction
and outputs the resulting upper subtraction result 911.

The address difference calculating means 912 finds
the address difference 913 by setting the lower subtraction
result 909 calculated by the lower address subtraction
means 907 as the lower 3 bits and the upper subtraction
result 911 calculated by the upper address subtraction
means 910 as the upper 29 bits.

The label information resolving means 914 converts an
instruction in the optimization processing code 903
including the present label into an instruction of a
suitable size, based on an address that was estimated and
assigned by the address assigning means 904 or the address
difference 913 found by the address difference calculating
means 912. If the assigned address or the address
difference 913 can be expressed using no more than 13 bits,
the label information resolving means 914 converts the
instruction into a 21-bit instruction, or if not the label
information resolving means 914 converts the instruction
into a 42-bit instruction.

After the labels have been resolved, the label
information resolving means 914 links the instruction
sequences into packets of three instruction units and
outputs the result as the optimized code 304.

The following describes a specific operation of the
optimization apparatus 303.

Fig. 10 is a flowchart showing the operation
procedure of the optimization apparatus.

First, the code optimization apparatus 902 optimizes
the assembler code 302 and generates optimization
processing code 903. Part of the optimization processing
code 903 generated by the code optimization apparatus 902
is shown in Fig. 11. Of the instructions in Fig. 11,
"L1:mov r2,r1" 1000 shows the position of the label L1 and
is an instruction that indicates a transfer from register
r2 to register r1. The instruction "jsr f" is a function
call that performs a relative branch to the label f (an
external label). A return from the function call to this
address is performed by a "ret" instruction. The
instruction "add r0,r4" adds the values of registers r0 and
r4 and stores the result in register r4. The instruction
"and r1,r3" 1003 calculates a logical AND for the values in
registers r1 and r3 and stores the result in register r3.
The instruction "mov L2,r2" 1004 transfers the address of
the instruction located at the label L2 into the register
r2. The instruction "ld (r2),r0" 1005 transfers the data
stored at the address stored in register r2 into the
register r0. The instruction "bra L1" 1006 performs an
indirect branch to the label L1 (a local label). Note that
in Fig. 11, the instructions that continue after
instruction 1007 have been omitted, though these
instructions do not include an instruction located at the
label f (step S9001).

The address assigning means 904 assigns a provisional
address to each instruction in the optimization processing
code 903 and so generates address assigned codes 916. Fig.
12 shows the address assigned codes 916 generated from the
optimization processing code 903 shown in Fig. 11. In this
example, provisional addresses starting from the value
"32'h00000800" have been assigned (step S9002).

The label detecting means 905 detects local labels in
the address assigned codes 916 and outputs label
information 906 composed of instructions that include the
detected labels and information on the values used to
resolve those labels. Fig. 13 shows the label information
906 that is generated from the address assigned codes 916
shown in Fig. 12. As shown in this figure, label L2 of
instruction 1104 is detected as a label that should be
resolved by an instruction address and label L1 is detected
as a label that should be resolved by a PC relative value.
Information showing the address for resolving the label L2
is appended to the instruction "mov L2,r2" that includes
the label L2, and information showing the addresses of the
branch destination instruction and branch instruction to be
used for calculating a PC relative value is appended to the
instruction "bra L1" that includes the label L1. Note that
since the label f in instruction 1101 is an external label,
it is not optimized (steps S9003, S9004).

When the label information 906 includes a label that
should be resolved by a PC relative value, processing to
calculate this PC relative value is performed. The lower
address subtraction means 907 calculates the lower 3 bits
of the value shown by the label L1 that is a PC relative
value. The lower address subtraction means 907 subtracts
the lower 3 bits "3'b010" of the provisional address
"32'h00000812" of the branch instruction 1106 from the
lower 3 bits "3'b000" of the provisional address
"32'h00000800" of the branch destination instruction 1100.
As a result, "1" is obtained as the carry value 908, and
"3'b100" is obtained as the lower subtraction result 909
(steps S9005, S9006).

The upper address subtraction means 910 calculates
the upper 29 bits of the value shown by the label L1 that
is a PC relative value. The upper address subtraction
means 910 subtracts the upper 29 bits "29'h00000102" of the
provisional address of the branch instruction 1106 and the
carry value 908 "1" generated by the lower address
subtraction means 907 from the upper 29 bits "29'h00000100"
of the provisional address of the branch destination
instruction 1100. As a result, "29'h1ffffffd" ("-3" in
base 10, negative numbers being hereafter shown in two's
complement) is obtained as the upper subtraction result 911
(step S9007).

The address difference calculating means 912 finds
the address difference, which is to say the PC relative
value, by setting the lower subtraction result 909 as the
lower bits and the upper subtraction result 911 as the
upper bits. In this example, the address difference
calculating means 912 sets "3'b100" as the lower bits and
"29'h1ffffffd" as the upper bits, giving an address
difference of "32'hffffffec" (step S9008).
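
The figures in this example can be reproduced with a short script. This is a sketch for checking the arithmetic only: the Difference tuple and the address_difference function are hypothetical names, and the field comments merely map each value to the reference numeral used in the text.

```python
# Hypothetical check of steps S9005 to S9008; names are illustrative.
from typing import NamedTuple


class Difference(NamedTuple):
    carry: int      # carry value 908
    lower: int      # lower subtraction result 909 (3 bits)
    upper: int      # upper subtraction result 911 (29 bits)
    address: int    # address difference 913 (32 bits)


def address_difference(destination, branch):
    """Subtract branch from destination using the carry method."""
    d_pos, b_pos = (destination & 7) // 2, (branch & 7) // 2
    carry = int(d_pos < b_pos)                 # borrow from the upper bits
    lower = ((d_pos - b_pos) % 3) * 2          # cycles through 000, 010, 100
    upper = ((destination >> 3) - (branch >> 3) - carry) & 0x1FFFFFFF
    return Difference(carry, lower, upper, (upper << 3) | lower)


result = address_difference(0x00000800, 0x00000812)
print(f"carry={result.carry} lower=3'b{result.lower:03b} "
      f"upper=29'h{result.upper:08x} difference=32'h{result.address:08x}")
# prints: carry=1 lower=3'b100 upper=29'h1ffffffd difference=32'hffffffec
```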

The label information resolving means 914 judges
whether the value used to resolve the label in the label
information 906 can be expressed by a 13-bit value. The
value that resolves the label L2 shown in Fig. 13 is
"32'h12345678", so that this value cannot be expressed as a
13-bit value, meaning that instruction 1104 including this
label L2 will become a 42-bit instruction. On the other
hand, the value used to resolve label L1 is "32'hffffffec",
which can be expressed by a 13-bit value. Accordingly, the
instruction 1106 that includes label L1 will become a 21-
bit instruction (steps S9009, S9010, S9011).
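
The size decision can be sketched as below. The sketch assumes that the 13-bit value is a signed two's complement field, which is consistent with "32'hffffffec" being judged expressible in 13 bits but is not stated outright in the text. The function names are hypothetical.

```python
# Hypothetical sketch of the size decision; names are illustrative.
def fits_in_signed(value, bits=13, width=32):
    """True if a `width`-bit two's complement value fits in `bits` bits."""
    # Reinterpret the unsigned bit pattern as a signed number.
    signed = value - (1 << width) if value >> (width - 1) else value
    return -(1 << (bits - 1)) <= signed < (1 << (bits - 1))


def instruction_size(resolving_value):
    """Return 21 when the value fits in 13 bits, otherwise 42."""
    return 21 if fits_in_signed(resolving_value) else 42
```

Under this assumption, instruction_size(0xFFFFFFEC) returns 21 and instruction_size(0x12345678) returns 42, matching the sizes given for instructions 1106 and 1104.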

The label information resolving means 914 links the
instruction sequences into packets of three instruction
units, based on the address assigned codes 916. When doing
so, the label information resolving means 914 converts
instructions that include labels into instructions of the
determined size. Here, one instruction unit is used for
21-bit instructions, and two units are used for 42-bit
instructions. After this, the label information resolving
means 914 outputs the instruction sequences that it has
converted into packets as the optimized code 304. Fig. 14
shows the optimized code 304 generated from the address
assigned codes 916 shown in Fig. 12. In Fig. 14, each row
shows the instructions that form one instruction packet,
with the marks "||" showing the boundaries between
instructions in a packet. Curved brackets "( )" are used in
this drawing to indicate 42-bit instructions that each
occupy two units (step S9012).

As described above, addresses are estimated with a
calculation method that uses a carry. In this way, a
suitable optimization apparatus for a processor that uses a
carry method can be achieved.

Note that the provisional addresses assigned by the
address assigning means 904 and the PC relative values
calculated by the address difference calculating means 912
are values that are estimated for determining the sizes of
all instructions that include labels. There are cases when
these estimates differ from the actual values, so that
these values are not used hereafter in the processing.



Assembler

Fig. 15 is a block diagram that shows the
construction of the assembler 305 shown in Fig. 5 and the
input/output data related to the assembler 305. This
assembler 305 converts the optimized code 304 generated by
the optimization apparatus 303 into relocatable codes 306
that have a relocatable address format. The assembler 305
includes a machine language code generating means 802, a
label detecting means 804, a lower address subtraction
means 806, an upper address subtraction means 809, an
address difference calculating means 811, and a label
information resolving means 813. The machine language code
generating means 802 converts the optimized code 304 into
machine language codes 803 that can be executed by the
processor 309. However, labels whose values have not been
resolved are not converted and are stored in the machine
language codes 803 as they are. The machine language code
generating means 802 assigns a packet address and an in-
packet address to each machine language code. As described
later, labels are later resolved using these addresses.

The label detecting means 804 finds a label that
should be resolved by a PC relative value, which is to say,
a difference in addresses between two instructions, and
obtains the addresses of the branch instruction and the
branch destination instruction. After this, the label
detecting means 804 outputs label information 805 that is
composed of the instructions that include labels and the
values that resolve these labels.

To resolve the label information 805 obtained by the
label detecting means 804, the lower address subtraction
means 806, the upper address subtraction means 809, and the
address difference calculating means 811 calculate a PC
relative value as follows.

The lower address subtraction means 806 subtracts the
lower 3 bits of the address of a branch instruction from
the lower 3 bits of the address of the branch destination
instruction and outputs the carry value 807 and the lower
subtraction result 808.

The upper address subtraction means 809 subtracts the
upper 29 bits of the address of a branch instruction and
the carry value 807 calculated by the lower address
subtraction means 806 from the upper 29 bits of the address
of the branch destination instruction and outputs the
resulting upper subtraction result 810.

The address difference calculating means 811 finds
the address difference 812 by setting the lower subtraction
result 808 calculated by the lower address subtraction
means 806 as the lower 3 bits and the upper subtraction
result 810 calculated by the upper address subtraction
means 809 as the upper 29 bits.

The label information resolving means 813 replaces
the labels in the machine language codes 803 with the
address differences 812 calculated by the address
difference calculating means 811, and outputs the resulting
relocatable codes 306.

The following explains a specific example of the
processing of the assembler 305 on receiving an input of
the optimized code 304 of Fig. 14 that has been outputted
by the optimization apparatus 303.

Fig. 16 is a flowchart showing the operation of the
assembler.



First, the machine language code generating means 802
converts each packet in the optimized code 304 into machine
language codes 803 that are suited to the processor 309.
However, the machine language code generating means 802
does not convert labels whose values have not been
resolved, so that these labels are stored as they are in
the machine language codes 803. After this, the machine
language code generating means 802 assigns packet addresses
(hereafter also called "local packet addresses") and in-
packet addresses to each instruction in the machine
language codes 803. Fig. 17 shows the machine language
codes 803 that are generated from the optimized code 304
shown in Fig. 14. Note that the actual machine language
codes are expressed in binary as sequences of zeros and
ones, though for ease of understanding these machine
language codes are shown in Fig. 17 in mnemonic form. The
parallel execution boundary information 100 and the format
information 101 will also be clear at this stage, but are
not illustrated to simplify the figure. In Fig. 17, packet
addresses (local packet addresses) are assigned starting
from the value "29'h00000000". The label f in the
instruction "jsr f" in packet 1300, the label L2 in the
instruction "mov L2,r2" in packet 1301, and the label L1 in
the instruction "bra L1" in packet 1302 have not yet been
resolved, so that these instructions are not converted
(steps S1500, S1501).



Next, the label detecting means 804 detects labels,
out of the unresolved labels in the machine language codes
803, which are local labels that should be resolved by a PC
relative value, and obtains the address of the instruction
including the label, which is to say, the branch
instruction, and the address of the branch destination
instruction. The label detecting means 804 then outputs
label information 805 that includes information showing the
instruction including the label and the value that resolves
the label. Fig. 18 shows the label information 805 that is
generated from the machine language codes shown in Fig. 17.
Here, label L1 is detected as a local label that should be
resolved by a PC relative value, "32'h00000012" is obtained
as the address of the branch instruction, and
"32'h00000000" is obtained as the address of the branch
destination instruction (steps S1502, S1503).

The lower address subtraction means 806 then
calculates the lower bits of the value L1 that is a PC
relative value. The lower address subtraction means 806
subtracts the lower 3 bits "3'b010" of the address
"32'h00000012" of the branch instruction 1409 from the
lower 3 bits "3'b000" of the address "32'h00000000" of the
branch destination instruction 1401. As a result, "1" is
obtained as the carry value 807 and "3'b100" is obtained as
the lower subtraction result 808 (step S1504).

Next, the upper address subtraction means 809
calculates the upper bits of the value L1 that is a PC
relative value. The upper address subtraction means 809
subtracts the upper 29 bits "29'h00000002" of the address
of the branch instruction 1409 and the carry value 807 "1"
from the upper 29 bits "29'h00000000" of the address of the
branch destination instruction 1401. As a result,
"29'h1ffffffd" ("-3" in base 10, negative numbers being
hereafter shown in two's complement) is obtained as the
upper subtraction result 810 (step S1505).

The address difference calculating means 811 finds
the address difference, which is to say the PC relative
value, by setting the lower subtraction result 808 as the
lower bits and the upper subtraction result 810 as the
upper bits. In this example, the address difference
calculating means 811 sets "3'b100" as the lower bits and
"29'h1ffffffd" as the upper bits, giving an address
difference of "32'hffffffec" (step S1506).

The label information resolving means 813 judges
whether the address difference 812 can be expressed by only
its lower 13 bits. If so, the label information resolving
means 813 sets the lower 13 bits of the address difference
812 as the PC relative value, or if not, the label
information resolving means 813 sets the entire address
difference 812 as the PC relative value. As a result, a
label in the machine language codes 803 is converted into a
PC relative value. The address difference that resolves
label L1 in the label information in Fig. 17 is
"32'hffffffec", which can be expressed by the lower 13-bit
value "13'h1fec", so that the label L1 in the machine
language codes shown in Fig. 17 is converted into the lower
13-bit value. Fig. 19 shows the relocatable codes that are
generated from the machine language codes 803 shown in Fig.
17. In Fig. 19, the instruction 1609 has been produced by
converting the label L1 into a PC relative value. Fig. 19
shows the parallel execution boundary information 100 and
format information 101 of each instruction that had already
been established when the machine language codes 803 were
outputted, and also shows the unused bit in each
instruction packet (steps S1507, S1508, S1509).
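
The truncation to the lower 13 bits can be sketched as follows. The sketch assumes that a 13-bit field is later sign-extended back to 32 bits, so that a difference is "expressible" when truncation followed by sign extension restores the original value. That interpretation, and the names to_field, sign_extend, and encode_pc_relative, are assumptions made for illustration.

```python
# Hypothetical sketch of the 13-bit PC relative field; names are illustrative.
def to_field(value, bits=13):
    """Keep only the lower `bits` bits of a value."""
    return value & ((1 << bits) - 1)


def sign_extend(field, bits=13, width=32):
    """Sign-extend a `bits`-bit field to a `width`-bit two's complement value."""
    sign = 1 << (bits - 1)
    return ((field ^ sign) - sign) & ((1 << width) - 1)


def encode_pc_relative(address_difference, bits=13):
    """Return (value to embed, number of bits used)."""
    field = to_field(address_difference, bits)
    if sign_extend(field, bits) == address_difference:
        return field, bits            # the lower 13 bits are sufficient
    return address_difference, 32     # the entire difference is needed
```

For the address difference "32'hffffffec" this yields the field 0x1fec, and sign-extending 0x1fec gives back 0xffffffec.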

As described above, by finding a PC relative value by
performing address calculation according to a carry method,
an assembler corresponding to a processor that uses a carry
method can be realized.



Linker



Fig. 20 is a block diagram showing the construction
of the linker 307 shown in Fig. 5 and the I/O
(input/output) data of the linker 307. This linker 307
combines a plurality of relocatable codes 701, determines
the addresses of each instruction, and outputs the object
code 714 that is executable by the processor 309 and is in
absolute address format. The linker 307 includes the code
combining means 702, the relocation information detecting
means 704, the lower address subtraction means 706, the
upper address subtraction means 709, the address difference
calculating means 711, and the relocation information
resolving means 713.

The code combining means 702 combines a plurality of
inputted relocatable codes 701 and determines the addresses
of all instructions. The code combining means 702 then
resolves the labels that should be resolved by instruction
addresses using the determined addresses and outputs the
combined codes 703 that result from its operation.

The relocation information detecting means 704
searches for external labels that should be resolved by PC
relative addresses and obtains the addresses of branch
instructions and the branch destination instructions.
After doing so, the relocation information detecting means
704 outputs relocation information 705 that includes
information showing instructions that include labels and
values to be used to resolve the labels. To resolve the
resulting relocation information 705, the lower address
subtraction means 706, the upper address subtraction means
709, and the address difference calculating means 711
calculate PC relative values, as described below.

The lower address subtraction means 706 subtracts the
lower 3 bits of the address of the branch instruction from
the lower 3 bits of the address of the branch destination
instruction, and so generates a carry value 707 and a lower
subtraction result 708.

The upper address subtraction means 709 subtracts the
upper 29 bits of the address of the branch instruction and
the carry value 707 generated by the lower address
subtraction means 706 from the upper 29 bits of the address
of the branch destination instruction, and so generates the
upper subtraction result 710.

The address difference calculating means 711 sets the
lower subtraction result 708 calculated by the lower
address subtraction means 706 as the lower 3 bits and the
upper subtraction result 710 calculated by the upper
address subtraction means 709 as the upper 29 bits to
generate the address difference 712.

The relocation information resolving means 713
replaces labels in the combined codes 703 with address
differences 712 calculated by the address difference
calculating means 711, and outputs the resulting object
code 308.

The operation of the linker 307 is explained below
using an example where the relocatable codes 306 shown in
Fig. 19 that have been outputted by the assembler 305 have
been inputted.

Fig. 21 is a flowchart showing the operation of the
linker 307.

First, the code combining means 702 combines a
plurality of relocatable codes 701. Fig. 23 shows the
state when the relocatable codes 814 shown in Fig. 19 have
been combined with the relocatable code shown in Fig. 22.
The code combining means 702 combines these relocatable
codes with the packet address of the first relocatable code
in Fig. 22 as "29'h00000000" and the packet address of the
first relocatable code in Fig. 19 as "29'h00000001" (steps
S2000, S2001).

The addresses of all instructions are determined in
this way, so that the code combining means 702 can resolve
the addresses of labels that should be resolved by
instruction addresses and then output the resulting
combined codes 703. Fig. 23 shows that the address of
label L2 in instruction 1810 "mov L2,r2" is the starting
address of instruction packet 1815. This address has been
set at "32'h12345680", so that the code combining means 702
uses this value to replace the label L2. Fig. 24 shows the
resulting combined codes 703. In instruction 1910 in Fig.
24, the label L2 has been replaced with this address
"32'h12345680" (step S2002).

Next, the relocation information detecting means 704
finds external labels in the combined codes 703 that should
be resolved by PC relative values and extracts the
addresses of the instructions that include these labels and
the addresses of the instructions where these labels are
located, which is to say, the addresses of branch
instructions and branch destination instructions. After
this, the relocation information detecting means 704
outputs relocation information 705 that is composed of
information showing the instructions including labels and
the values to be used to resolve these labels. Fig. 25
shows the label information that is generated from the
combined codes 703 shown in Fig. 24. Here, label f is
found as an external label that should be resolved by a PC
relative value, so that "32'h0000000a" is obtained as the
address of a branch instruction and "32'h00000000" as the
address of the branch destination instruction (steps S2003,
S2004).

The lower address subtraction means 706 then
calculates the lower bits of the value f that is a PC
relative value. The lower address subtraction means 706
subtracts the lower 3 bits "3'b010" of the address
"32'h0000000a" of the branch instruction 1906 from the
lower 3 bits "3'b000" of the address "32'h00000000" of the
branch destination instruction 1901. As a result, "1" is
obtained as the carry value 707 and "3'b100" is obtained as
the lower subtraction result 708 (step S2005).

Next, the upper address subtraction means 709
calculates the upper bits of the value f that is a PC
relative value. The upper address subtraction means 709
subtracts the upper 29 bits "29'h00000001" of the address
"32'h0000000a" of the branch instruction 1906 and the carry
value 707 "1" from the upper 29 bits "29'h00000000" of the
address of the branch destination instruction 1901. As a
result, "29'h1ffffffe" is obtained as the upper subtraction
result 710 (step S2006).

The address difference calculating means 711 finds
the address difference 712, which is to say the PC relative
value, by setting the lower subtraction result 708 as the
lower bits and the upper subtraction result 710 as the
upper bits. In this example, the address difference
calculating means 711 sets "3'b100" as the lower bits and
"29'h1ffffffe" as the upper bits, giving an address
difference of "32'hfffffff4" (step S2007).
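The carry-based subtraction of steps S2005 to S2007 can be
sketched in Python as follows. This is only an illustrative
sketch, not the linker itself: the function and variable names
are hypothetical, and it assumes that the three units of a
packet have the in-packet addresses "3'b000", "3'b010", and
"3'b100" (unit index times two), as the example values suggest.

```python
MASK29 = 0x1FFFFFFF
UNITS_PER_PACKET = 3  # assumption: three units per packet (000, 010, 100)


def pc_relative_with_carry(dest, branch):
    """Return (carry, lower 3 bits, upper 29 bits, 32-bit difference)."""
    # Convert the lower 3 bits of each address to a unit index 0..2.
    diff_units = ((dest & 0x7) >> 1) - ((branch & 0x7) >> 1)
    # A negative unit difference borrows one packet from the upper bits.
    carry = 1 if diff_units < 0 else 0
    lower = (diff_units % UNITS_PER_PACKET) << 1
    upper = ((dest >> 3) - (branch >> 3) - carry) & MASK29
    return carry, lower, upper, (upper << 3) | lower


carry, lower, upper, diff = pc_relative_with_carry(0x00000000, 0x0000000A)
print(carry, format(lower, "03b"), format(upper, "08x"), format(diff, "08x"))
# prints: 1 100 1ffffffe fffffff4
```

With the addresses of the example, the sketch reproduces the
carry value "1", the lower subtraction result "3'b100", the
upper subtraction result "29'h1ffffffe", and the address
difference "32'hfffffff4".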

Next, the relocation information resolving means 713
converts a label in the combined codes 703 into a PC
relative value, setting the lower 13 bits of the address
difference 712 as the PC relative value if this address
difference 712 can be expressed by the lower 13 bits, or
otherwise setting the entire address difference 712 as the
PC relative value. The address difference that resolves
the label f in the relocation information in Fig. 24 is
"32'hfffffff4", which can be expressed by the lower 13-bit
value "13'h1ff4", so that the label f in the combined codes
703 shown in Fig. 23 is converted into this lower 13-bit
value to produce the object code. The resulting object
code is shown in Fig. 26. In instruction 2106 in Fig. 26,
the label f has been converted into the lower 13-bit value
"13'h1ff4" (steps S2008, S2009, S2010).

As described above, the present linker finds PC
relative values using an address calculation including a
carry, and so is suited to a processor that uses a carry.
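The choice between a 13-bit and a full 32-bit PC relative
value in steps S2008 to S2010 can be sketched as below. The
sketch assumes that "can be expressed by the lower 13 bits"
means that sign-extending the lower 13 bits restores the full
difference, which matches the sign extension that the
processor is described as applying. The helper names are
hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def resolve_label(diff32):
    """Return (width in bits, PC relative value) for an address difference."""
    low13 = diff32 & 0x1FFF
    if sign_extend(low13, 13) & 0xFFFFFFFF == diff32 & 0xFFFFFFFF:
        return 13, low13            # short form is sufficient
    return 32, diff32 & 0xFFFFFFFF  # fall back to the whole difference


width, value = resolve_label(0xFFFFFFF4)
print(width, format(value, "x"))  # prints: 13 1ff4
```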

Specific Operation of the Processor

The following describes the operation of the
processor when the object code shown in Fig. 26 has been
stored in the instruction memory 407.

At the start of execution of this object code, the
upper PC 403 is set at "29'h00000000" and the lower PC 404
is set at "3'b000". The prefetch upper counter 410
receives an input from the upper PC 403 and so is set at
"29'h00000000".

The read of instructions from the instruction memory
407 is performed in packet units according to the value in
the prefetch upper counter 410. In detail, instruction
packet 2100 that is indicated by the prefetch upper counter
410 is read from the instruction sequence stored in the
instruction memory 407 and is stored in the instruction
buffer 408. The value of the prefetch upper counter 410 is
incremented by one in each cycle, and so here becomes
"29'h00000001". Hereafter, an instruction packet indicated
by the prefetch upper counter 410 is read from the
instruction memory 407 and written into the instruction
buffer 408 in each cycle.

The following explains the operations for decoding
and executing instructions for the case when instruction
packet 2104 is indicated by the upper PC 403 and
instruction 2107 in instruction packet 2104 is indicated by
the lower PC 404. The instructions stored in the
instruction buffer 408 are interpreted by the instruction
decoders 409a ~ 409c. The first instruction decoder 409a
receives an input of the first unit, unit 2107, in the
instruction packet 2104 and investigates whether unit 2107
is a one-unit instruction and whether there is a parallel
execution boundary. Since unit 2107 is a one-unit
instruction and there is no parallel execution boundary,
the second instruction decoder 409b receives an input of
the next unit, unit 2109, and investigates whether unit
2109 is a one-unit instruction and whether there is a
parallel execution boundary. Since unit 2109 is a one-unit
instruction and there is no parallel execution boundary,
the third instruction decoder 409c receives an input of the
next unit and investigates whether this next unit is a
one-unit instruction and whether there is a parallel
execution boundary. Since this unit is not a one-unit
instruction, the third instruction decoder 409c also
receives an input of the following unit. The third
instruction decoder 409c then finds that this following
unit includes a parallel execution boundary. As a result,
the instructions 2107, 2109, and 2110 are executed in
parallel.
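The way the three decoders collect instructions up to a
parallel execution boundary can be modelled roughly as
follows. This is a simplified sketch under stated
assumptions: each instruction is represented by a
hypothetical tuple of its text, its length in units, and a
flag saying whether it carries the parallel execution
boundary. The actual bit-level encoding of these properties
is not modelled.

```python
from typing import List, Tuple

# (instruction text, length in units, ends the parallel group): hypothetical model
Instruction = Tuple[str, int, bool]


def issue_group(stream: List[Instruction], decoders: int = 3):
    """Collect the instructions issued in one cycle and count their units."""
    issued, units = [], 0
    for text, length, boundary in stream[:decoders]:
        issued.append(text)
        units += length
        if boundary:  # stop at the parallel execution boundary
            break
    return issued, units


stream = [
    ("add r0,r4", 1, False),
    ("and r1,r3", 1, False),
    ("mov 32'h12345680,r2", 2, True),
    ("ld (r2),r0", 1, False),
]
print(issue_group(stream))
# prints: (['add r0,r4', 'and r1,r3', "mov 32'h12345680,r2"], 4)
```

The unit count returned by the sketch corresponds to the
number of instruction units that the decoders report for the
cycle.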

The first instruction decoder 409a decodes the
instruction "add r0,r4" and outputs control signals to the
first calculator 401a. The first calculator 401a adds the
values of registers r0 and r4 and stores the result in
register r4. The second instruction decoder 409b decodes
the instruction "and r1,r3" and outputs control signals to
the second calculator 401b. The second calculator 401b
performs a logical AND on the values of registers r1
and r3, and stores the result in register r3. The third
instruction decoder 409c decodes the instruction "mov
32'h12345680,r2" and so has the immediate "32'h12345680"
transferred into register r2.

In this case, the instruction decoders 409a ~ 409c
inform the INC 412 that a total of four instruction units
have been executed. The INC 412 increments the values in
the upper PC 403 and the lower PC 404 by four units. As a
result, the lower PC 404 becomes "3'b000", a carry of two
to the upper PC 403 is generated, and the upper PC 403
becomes "29'h00000003". This means that the first
instruction to be executed in the next cycle is instruction
2112.
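The increment performed by the INC 412 can be sketched as
below, again assuming three units per packet with the
in-packet addresses "3'b000", "3'b010", and "3'b100". The
function name is hypothetical. The starting value used in the
example (upper PC 1, lower PC "3'b100") is an inference: it is
the only position in this encoding from which four units
produce a carry of two and end at upper PC 3, lower PC
"3'b000".

```python
MASK29 = 0x1FFFFFFF
UNITS_PER_PACKET = 3  # assumption: in-packet addresses 000, 010, 100


def advance_pc(upper, lower, units):
    """Advance the split program counter by a number of instruction units."""
    index = (lower >> 1) + units
    carry, new_index = divmod(index, UNITS_PER_PACKET)
    return (upper + carry) & MASK29, new_index << 1, carry


print(advance_pc(0x00000001, 0b100, 4))  # prints: (3, 0, 2)
```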

The first instruction decoder 409a receives an input
of the first unit, unit 2112, and investigates whether unit
2112 is a one-unit instruction and whether there is a
parallel execution boundary. Since unit 2112 is a one-unit
instruction and there is no parallel execution boundary,
the second instruction decoder 409b receives an input of
the next unit, unit 2113, and investigates whether unit
2113 is a one-unit instruction and whether there is a
parallel execution boundary. Here, the second instruction
decoder 409b finds that unit 2113 is a one-unit instruction
and that there is a parallel execution boundary. As a
result, the processor 309 finds that instructions 2112 and
2113 can be executed in parallel.

The first instruction decoder 409a decodes the
instruction "ld (r2),r0" and has the operand data, which has
the value in register r2 as the operand address, read from
the data memory 406 and stored in register r0. The second
instruction decoder 409b decodes the instruction "bra
13'h1fec" and, since this is a branch instruction, updates
the values in the upper PC 403 and lower PC 404 using the
address of the branch destination instruction.

First, the address indicated by the upper PC 403 and
lower PC 404 is amended. While a PC relative value shows
the difference in addresses between a branch instruction
and its branch destination instruction, the upper PC 403
and lower PC 404 show the address of the first instruction
to be executed in the same cycle as the branch instruction,
so that the upper PC 403 and lower PC 404 are amended so
that they indicate the address of the branch instruction.
In detail, the INC 412 increments the values of the upper
PC 403 and lower PC 404 by one unit to show that the branch
instruction 2113 is preceded by one instruction unit, the
first instruction 2112. As a result, the lower PC 404
becomes "3'b010" and the upper PC 403 stays at
"29'h00000003".

Following this, the upper PC calculator 411 and the
lower PC calculator 405 add the PC relative value
"13'h1fec" obtained by the second instruction decoder 409b
to the upper PC 403 and the lower PC 404. Here, the
sign-extended 32-bit value "32'hffffffec" is used as the PC
relative value. This addition is split into additions of
the upper 29 bits and the lower 3 bits.

The lower PC calculator 405 adds the lower 3 bits
"3'b100" of the PC relative value to the value "3'b010" of
the lower PC 404. As a result, a carry of one and the
lower calculation result "3'b000" are obtained. The lower
PC calculator 405 sends the carry to the upper PC
calculator 411, and sends the lower calculation result to
the lower PC 404.

Next, the upper PC calculator 411 adds the upper 29
bits "29'h1ffffffd" of the PC relative value and the carry
value "1" received from the lower PC calculator 405 to the
value "29'h00000003" of the upper PC 403. The upper PC
calculator 411 sends the upper calculation result of
"29'h00000001" to the upper PC 403, which sends the value
on to the prefetch upper counter 410. As a result of this
processing, the prefetch upper counter 410 is set at
"29'h00000001", so that the next instruction packet to be
prefetched will be instruction packet 2104. Also, since
the upper PC 403 is "29'h00000001" and the lower PC 404 is
"3'b000", the first instruction to be executed in the next
cycle is instruction 2105.
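The branch address calculation described above (amending
the PC, sign-extending the 13-bit value, then adding the
lower and upper parts with a carry) can be sketched as
follows. The function names are hypothetical, and the sketch
again assumes the three in-packet addresses "3'b000",
"3'b010", and "3'b100".

```python
MASK29 = 0x1FFFFFFF
UNITS_PER_PACKET = 3  # assumption: in-packet addresses 000, 010, 100


def sign_extend13(rel13):
    """Sign-extend a 13-bit PC relative value to 32 bits."""
    return rel13 | 0xFFFFE000 if rel13 & 0x1000 else rel13


def branch_target(upper, lower, units_before_branch, rel13):
    """Return the (upper PC, lower PC) of the branch destination."""
    # Amend the PC so that it indicates the branch instruction itself.
    carry, index = divmod((lower >> 1) + units_before_branch, UNITS_PER_PACKET)
    upper = (upper + carry) & MASK29
    # Add the lower 3 bits first; the carry goes to the upper addition.
    rel32 = sign_extend13(rel13)
    carry, index = divmod(index + ((rel32 & 0x7) >> 1), UNITS_PER_PACKET)
    upper = (upper + (rel32 >> 3) + carry) & MASK29
    return upper, index << 1


print(branch_target(0x00000003, 0b000, 1, 0x1FEC))  # prints: (1, 0)
```

For the example values, the sketch yields upper PC
"29'h00000001" and lower PC "3'b000", in agreement with the
text.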

Hereafter, codes in the object code are successively
read and executed in the same way, so that no explanation
will be given for the other instructions.

This completes the detailed explanation of the
constructions of the processor 309, linker 307, assembler
305 and optimization apparatus 303 shown in Fig. 5. A
conventional compiler can be used as the compiler 301, so
that no explanation of such will be given.

Note that while the processor of this embodiment
includes three instruction decoders 409a - 409c and three
calculators 401a - 401c, the present invention is not
limited to this construction, so that only one instruction
decoder and one calculator may be provided. It is also
possible for the functions of the optimization apparatus
303 to be incorporated into the compiler 301, and to have
the object code 308 generated from the source code by
the compiler 301, the assembler 305, and the linker 307.

In the present embodiment, the prefetch lower counter
413 was described as having the fixed value of "3'b000",
though this need not be the case. As one example, this
value may be incremented by one in each cycle. This
results in one byte of data being read from the instruction
memory 407 and written into the instruction buffer 408 in
each cycle.


Second Embodiment

The second embodiment of the present invention
relates to a modification of the processor, optimization
apparatus, assembler, and linker of the first embodiment.
This modification uses a different value as the PC relative
value for resolving labels in branch instructions.

In the first embodiment, the PC relative value in a
branch instruction is a difference in addresses between the
branch instruction and the branch destination instruction,
while in this second embodiment, the PC relative value in a
branch instruction is a difference between the address of
the branch destination instruction and the address of the
first instruction in the same set of instructions as the
branch instruction.

In this way, the PC relative value has a slightly
different meaning than in the first embodiment. However,
if the devices used to generate a program (i.e., the
optimization apparatus 303, assembler 305, and linker 307
that calculate the PC relative value) use the same meaning
as the device that executes the program (i.e., a processor
that calculates an address based on the PC relative value),
the processor will be able to correctly change the program
counter to the address of a branch destination instruction
when executing a branch instruction.

The following explains the optimization apparatus
303, assembler 305, linker 307, and processor.

The label detecting means 905 of the optimization
apparatus 303 generates the label information 906 for
labels that should be resolved by PC relative values in the
following way. Instead of generating label information
after obtaining the provisional addresses of the branch
instruction and the branch destination instruction in the
same way as in the first embodiment, the label detecting
means 905 generates the label information 906 after
obtaining the provisional addresses of the branch
destination instruction and the address of the first
instruction in the same set of instructions as the branch
instruction. In the same way as in the first embodiment,
this label information 906 is then used to calculate the
address difference 913 that is the difference between two
provisional addresses and is used in the optimized code
304. The assembler and linker also operate in this way.

The following describes a specific example of the
object code 308 generated in this embodiment.

The assembler 305 replaces the label in
instruction 1409 in the machine language codes shown in
Fig. 17 with the subtraction value "13'h1ff0" produced by
subtracting the address "32'h00000010" of instruction 1408,
which is the first instruction in the same set of
instructions as instruction 1409, from the address
"32'h00000000" of the branch destination instruction. In
the same way, the linker 307 replaces the label f in
instruction 1906 in the combined codes shown in Fig. 24
with the subtraction value "13'h1ff8" produced by
subtracting the address "32'h00000008" of the instruction
1907, which is the first instruction in the same set of
instructions as instruction 1906, from the address
"32'h00000000" of the branch destination instruction. Fig.
27 shows that the PC relative value of instruction 2213
differs from that shown in Fig. 26.

The following describes the processor of the present
embodiment.

The processor 309 executes object code that has been
generated as described above. When the processor 309
executes a branch instruction, the PC relative value in the
branch instruction is a difference in addresses between the
branch destination instruction and the first instruction in
the same set of instructions as the branch instruction.
Accordingly, the processor 309 does not amend the values of
the upper PC 403 and lower PC 404, and, in the same way as
in the first embodiment, adds the PC relative value to the
values in the upper PC 403 and lower PC 404 and updates the
values in the upper PC 403 and lower PC 404 using the
addition results. When this processor 309 executes the
object code shown in Fig. 27, the execution of instruction
2213 results in the PC relative value "13'h1ff8" being added
to the present PC "32'h00000008", resulting in the PC being
updated to "32'h00000000".
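The difference between the two embodiments can be
illustrated with the short sketch below. It reuses the
carry-based arithmetic of the first embodiment and only
changes the base address from which the difference is taken.
The names are hypothetical, and the three in-packet
addresses "3'b000", "3'b010", and "3'b100" are assumed.

```python
MASK29 = 0x1FFFFFFF


def pc_relative(dest, base):
    """Carry-based 32-bit difference dest - base."""
    diff_units = ((dest & 0x7) >> 1) - ((base & 0x7) >> 1)
    carry = 1 if diff_units < 0 else 0
    upper = ((dest >> 3) - (base >> 3) - carry) & MASK29
    return (upper << 3) | ((diff_units % 3) << 1)


def branch_target(pc, rel13):
    """Add a 13-bit PC relative value to the PC without amending the PC."""
    rel = rel13 | 0xFFFFE000 if rel13 & 0x1000 else rel13
    carry, index = divmod(((pc & 0x7) >> 1) + ((rel & 0x7) >> 1), 3)
    upper = ((pc >> 3) + (rel >> 3) + carry) & MASK29
    return (upper << 3) | (index << 1)


dest, branch, first_in_set = 0x00000000, 0x0000000A, 0x00000008
print(format(pc_relative(dest, branch) & 0x1FFF, "x"))        # prints: 1ff4
print(format(pc_relative(dest, first_in_set) & 0x1FFF, "x"))  # prints: 1ff8
print(format(branch_target(first_in_set, 0x1FF8), "08x"))     # prints: 00000000
```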

As described above, unlike the processor of the first
embodiment, the processor of the present embodiment does
not need to amend the value of the program counter whenever
a branch instruction is executed. The address of a branch
destination instruction can instead be obtained by directly
adding a PC relative value to the PC. This reduces the
total execution time.

Third Embodiment

The third embodiment of the present invention relates
to a processor that can indicate the execution position of
an instruction by fully utilizing the lower 3 bits of
instruction addresses.

In the first embodiment the lower 3 bits of the
instruction address are used to indicate a position that is
one of three units. In the present embodiment, however,
full use is made of these 3 bits by having them indicate
one of eight units.

Fig. 28A shows the construction of an instruction
packet in the present embodiment. This instruction packet
is composed of eight instruction units. Each instruction
unit in an instruction packet is 8 bits long, so that the
total length of one instruction packet is 64 bits. The
processor in this embodiment reads one instruction packet
(64 bits) in one cycle.

Fig. 28B shows the types of instructions used in this
embodiment. Each instruction is composed of 8-bit
instruction units, with there being one-, two-, three-,
four-, five-, and six-unit instructions.

Fig. 28C shows the relation between in-packet
addresses and the instruction units in a packet. In the
same way as in the first embodiment, a position in an
instruction packet is indicated by the lower 3 bits of an
instruction address. As shown in Fig. 28C, the in-packet
address "3'b000" indicates the first unit, the in-packet
address "3'b001" indicates the second unit, the in-packet
address "3'b010" indicates the third unit, the in-packet
address "3'b011" indicates the fourth unit, the in-packet
address "3'b100" indicates the fifth unit, the in-packet
address "3'b101" indicates the sixth unit, the in-packet
address "3'b110" indicates the seventh unit, and the in-
packet address "3'b111" indicates the eighth unit.
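The mapping of Fig. 28C can be expressed compactly as
below. This is a minimal sketch with hypothetical names; it
only assumes, as stated above, that the upper 29 bits select
the packet and the lower 3 bits select one of eight units.

```python
ORDINALS = ["first", "second", "third", "fourth",
            "fifth", "sixth", "seventh", "eighth"]


def locate_unit(address):
    """Split a 32-bit instruction address into (packet address, unit index)."""
    return address >> 3, address & 0x7


packet, unit = locate_unit(0x0000000D)  # binary ...01 101
print(packet, ORDINALS[unit])  # prints: 1 sixth
```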

As described above, the processor of the present
embodiment indicates the execution position of an
instruction making full use of the lower 3 bits of the
instruction address. As a result, instructions can be
executed with a greater variation of execution units for
one cycle.

Fourth Embodiment

The fourth embodiment of the present invention
relates to a method for calculating instruction addresses
without using a carry.

The first embodiment teaches a processor for
executing a program, and an optimization apparatus,
assembler, and linker for generating a suitable program.
All of these devices use a common method for calculating an
instruction address using a carry. This has the effect
that the processor can correctly generate the address of a
branch destination instruction using a PC relative value.
However, this effect can also be achieved if the processor,
optimization apparatus, assembler, and linker use a common
address calculation method that does not use a carry.
The present embodiment relates to such a calculation
method that calculates addresses without using a carry.

This calculation method that does not use a carry
resembles the calculation method in the first embodiment in
that the calculation of addresses is performed separately
for the upper 29 bits and lower 3 bits. However, the
present method differs by not using a carry.

The following explains the method by which the
processor finds the address of a branch destination
instruction by adding the address of a branch instruction
and a PC relative value. The lower PC calculator 405 shown
in Fig. 6 adds the lower 3 bits of the address of the
branch instruction and the lower 3 bits of the PC relative
value. Fig. 29A is an addition table showing the addition
rules for adding the lower 3 bits of the address of the
branch instruction and the lower 3 bits of the PC relative
value in the present calculation method. As shown in the
figure, this calculation differs from a normal addition of
binary values in that it cycles between the three states
"3'b000", "3'b010", and "3'b100". Note that no carry is
generated.

The upper PC calculator 411 shown in Fig. 6 adds the
upper 29 bits of the address of the branch instruction and
the upper 29 bits of the PC relative value. This is a
normal addition of binary values.

The results of the above additions form the address
of a branch destination instruction. In detail, the
addition result for the lower 3 bits is set in the lower PC
404 and the addition result for the upper 29 bits is set in
the upper PC 403.

The following explains the method used by the
optimization apparatus, assembler, and linker to calculate
the PC relative value, which is to say, to subtract the
address of the branch instruction from the address of the
branch destination instruction. This subtraction is
split into an upper 29 bits and lower 3 bits like the
addition performed by the processor. The lower address
subtraction means 907 of the optimization apparatus 303,
the lower address subtraction means 806 of the assembler
305, and the lower address subtraction means 706 of the
linker 307 subtract the lower 3 bits of the address of a
branch instruction from the lower 3 bits of the address of
the branch destination instruction. Fig. 29B is a
subtraction table showing the subtraction rules for
subtracting the lower 3 bits of the address of the branch
instruction from the lower 3 bits of the address of the
branch destination instruction. As shown in the figure,
this calculation differs from a normal subtraction of
binary values in that it cycles between the three states
"3'b000", "3'b010", and "3'b100". Note that no carry is
generated.

The upper address subtraction means 910 of the
optimization apparatus 303, the upper address subtraction
means 809 of the assembler 305, and the upper address
subtraction means 709 of the linker 307 subtract the upper
29 bits of the address of the branch instruction from the
upper 29 bits of the address of the branch destination
instruction. This is a normal subtraction of binary
values.

The PC relative value is then found by setting the
result of the above subtraction for the lower 3 bits as the
lower 3 bits and the result of the above subtraction for
the upper 29 bits as the upper 29 bits.

Fig. 30 shows the object code that is generated by
the above address calculation method of the present
embodiment that does not use a carry. The PC relative
values of instructions 2406 and 2413 differ from those in
Fig. 26. The following explains the calculation of the PC
relative value of instruction 2406.

The lower address subtraction means 706 subtracts the
lower 3 bits "3'b010" of the address of instruction 2406
from the lower 3 bits "3'b000" of the address of
instruction 2401 in accordance with the subtraction table
shown in Fig. 29B. This produces the lower subtraction
result "3'b100".

The upper address subtraction means 709 subtracts the
upper 29 bits "29'h00000001" of the address of instruction
2406 from the upper 29 bits "29'h00000000" of the address
of instruction 2401. This produces the upper subtraction
result "29'h1fffffff".

The address difference calculating means 711
generates the address difference "32'hfffffffc" by setting
the upper subtraction result "29'h1fffffff" as the upper 29
bits and the lower subtraction result "3'b100" as the lower
3 bits.



The relocation information resolving means 713 judges
that the address difference "32'hfffffffc" can be expressed
by just the lower 13 bits "13'h1ffc" and so replaces a
label with this value "13'h1ffc" as a PC relative value to
generate instruction 2406.
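The subtraction without a carry can be sketched as follows.
Fig. 29B is not reproduced here, so the sketch assumes that
its table amounts to a cyclic (modulo 3) subtraction of unit
indices over the three states "3'b000", "3'b010", and
"3'b100", which agrees with the worked example. The names
are hypothetical.

```python
MASK29 = 0x1FFFFFFF


def pc_relative_no_carry(dest, branch):
    """Return (upper 29 bits, lower 3 bits, 32-bit difference), no borrow."""
    # Lower 3 bits: cyclic subtraction over the three unit positions.
    lower = ((((dest & 0x7) >> 1) - ((branch & 0x7) >> 1)) % 3) << 1
    # Upper 29 bits: ordinary binary subtraction, independent of the lower part.
    upper = ((dest >> 3) - (branch >> 3)) & MASK29
    return upper, lower, (upper << 3) | lower


upper, lower, diff = pc_relative_no_carry(0x00000000, 0x0000000A)
print(format(upper, "08x"), format(lower, "03b"), format(diff & 0x1FFF, "x"))
# prints: 1fffffff 100 1ffc
```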

The processor 309 executes the object code generated
as described above. When executing a branch instruction,
the processor 309 adds the upper PC 403 and lower PC 404,
which have been amended to correctly indicate the branch
instruction, to the PC relative value in the branch
instruction without generating a carry.

When the processor 309 executes instruction 2406 in
the object code shown in Fig. 30, the lower PC calculator
405 adds the amended lower PC 404 "3'b010" and the lower 3
bits "3'b100" of the PC relative value and updates the
lower PC 404 to the resulting addition value "3'b000". The
upper PC calculator 411 adds the amended upper PC 403
"29'h00000001" and the upper 29 bits "29'h1fffffff" of the
PC relative value and updates the upper PC 403 to the
resulting addition value "29'h00000000".

As described above, the present calculation method
can calculate addresses without a carry being sent between
the lower PC calculator 405 and the upper PC calculator
411. This means that address calculation can be performed
with a simpler hardware construction.
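The corresponding addition on the processor side can be
sketched in the same way. As before, this assumes that the
addition table of Fig. 29A is a cyclic (modulo 3) addition
over the three in-packet addresses. The function name is
hypothetical.

```python
MASK29 = 0x1FFFFFFF


def add_no_carry(upper_pc, lower_pc, rel32):
    """Add a 32-bit PC relative value to the split PC without a carry."""
    # The two halves are computed independently of each other.
    new_lower = (((lower_pc >> 1) + ((rel32 & 0x7) >> 1)) % 3) << 1
    new_upper = (upper_pc + (rel32 >> 3)) & MASK29
    return new_upper, new_lower


# Amended PC of the branch (upper 1, lower 3'b010) plus 32'hfffffffc.
print(add_no_carry(0x00000001, 0b010, 0xFFFFFFFC))  # prints: (0, 0)
```

Because the upper and lower halves never exchange a value,
they could be evaluated in either order or at the same time,
which is the property the text attributes to this method.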






Fifth Embodiment 

The fifth embodiment of the present invention teaches
a method for calculating instruction addresses using
absolute values.

This calculation method that uses absolute values
resembles the calculation method in the first embodiment in
that the calculation of addresses is performed separately
for the upper 29 bits and lower 3 bits. However, the
present method differs from the carry method in that the
value of the lower 3 bits of an instruction address is set
as the lower 3 bits of the calculation result.

The following explains the method by which the
processor finds the address of a branch destination
instruction by adding the address of a branch instruction
and a PC relative value. The lower PC calculator 405 shown
in Fig. 6 adds the lower 3 bits of the address of the
branch instruction and the lower 3 bits of the PC relative
value. Fig. 31A is an addition table showing the addition
rules for adding the lower 3 bits of the address of the
branch instruction and the lower 3 bits of the PC relative
value in the present calculation method that uses absolute
values. As shown in the figure, the lower 3 bits of the PC
relative value are set as the lower 3 bits of the addition
result.

The upper PC calculator 411 shown in Fig. 6 adds the
upper 29 bits of the address of the branch instruction and
the upper 29 bits of the PC relative value. This is a
normal addition of binary values.

The results of the above additions form the address
of a branch destination instruction. In detail, the
addition result for the lower 3 bits is set in the lower PC
404 and the addition result for the upper 29 bits is set in
the upper PC 403.

The following explains the method used by the
optimization apparatus 303, assembler 305, and linker 307
to calculate the PC relative value, which is to say, to
subtract the address of the branch instruction from the
address of the branch destination instruction. This
subtraction is split into an upper 29 bits and lower 3
bits, like the addition performed by the processor. The
lower address subtraction means 907 of the optimization
apparatus 303, the lower address subtraction means 806 of
the assembler 305, and the lower address subtraction means
706 of the linker 307 subtract the lower 3 bits of the
address of a branch instruction from the lower 3 bits of
the address of the branch destination instruction. Fig.
31B is a subtraction table showing the subtraction rules
for subtracting the lower 3 bits of the address of the
branch instruction from the lower 3 bits of the address of
the branch destination instruction in this calculation
method that uses absolute values. As shown in the figure,
the lower 3 bits of the branch destination address are set
as the subtraction result for the lower 3 bits.

The upper address subtraction means 910 of the
optimization apparatus 303, the upper address subtraction
means 809 of the assembler 305, and the upper address
subtraction means 709 of the linker 307 subtract the upper
29 bits of the address of the branch instruction from the
upper 29 bits of the address of the branch destination
instruction. This is a normal subtraction of binary
values.

The PC relative value is then found by setting the
result of the above subtraction for the lower 3 bits as the
lower 3 bits and the result of the above subtraction for
the upper 29 bits as the upper 29 bits.

Fig. 32 shows the object code that is generated by
the above address calculation method of the present
embodiment that uses absolute values. The PC relative
values of instructions 2606 and 2613 differ from those in
Fig. 26. The following explains the calculation of the PC
relative value of instruction 2606.

The lower address subtraction means 706 subtracts the
lower 3 bits "3'b010" of the address of instruction 2606
from the lower 3 bits "3'b000" of the address of
instruction 2601 in accordance with the subtraction table
shown in Fig. 31B. This produces the lower subtraction
result "3'b000".

The upper address subtraction means 709 subtracts the
upper 29 bits "29'h00000001" of the address of instruction
2606 from the upper 29 bits "29'h00000000" of the address
of instruction 2601. This produces the upper subtraction
result "29'h1fffffff".

The address difference calculating means 711
generates the address difference "32'hfffffff8" by setting
the upper subtraction result "29'h1fffffff" as the upper 29
bits and the lower subtraction result "3'b000" as the lower
3 bits.

The relocation information resolving means 713 judges
that the address difference "32'hfffffff8" can be expressed
by just the lower 13 bits "13'h1ff8" and so replaces a
label with this value "13'h1ff8" as a PC relative value to
generate instruction 2606.

The processor 309 executes the object code generated
as described above. When executing a branch instruction,
the processor 309 adds the upper PC 403 and lower PC 404,
which have been amended to correctly indicate the branch
instruction, to the PC relative value in the branch
instruction using the present absolute value method.

When the processor 309 executes instruction 2606 in
the object code shown in Fig. 32, the lower PC calculator
405 adds the amended lower PC 404 "3'b010" and the lower 3
bits "3'b000" of the PC relative value and updates the
lower PC 404 to the resulting addition value "3'b000". The
upper PC calculator 411 adds the amended upper PC 403
"29'h00000001" and the upper 29 bits "29'h1fffffff" of the
PC relative value and updates the upper PC 403 to the
resulting addition value "29'h00000000".

As described above, the present calculation method
can calculate addresses without needing to calculate the
lower bits, so that the speed for calculating addresses can
be improved.
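
The absolute value method described above can be sketched
as follows. This is only an illustration under stated
assumptions, not the circuitry of the embodiment: the
function names are hypothetical, addresses are modelled as
plain 32-bit integers, the PC relative value is handled at
its full 32-bit width rather than as a 13-bit field, and
the lower PC is assumed to take the lower 3 bits of the PC
relative value unchanged. The addresses are those of the
worked example above.

```python
MASK29 = (1 << 29) - 1  # upper 29 bits: packet address
MASK3 = 0b111           # lower 3 bits: in-packet address


def pc_relative_absolute(branch_addr, dest_addr):
    """Tool side (optimizer, assembler, linker): build the PC relative value."""
    # Lower 3 bits: the destination's in-packet address is used as it is.
    lower = dest_addr & MASK3
    # Upper 29 bits: normal binary subtraction, wrapped to 29 bits.
    upper = ((dest_addr >> 3) - (branch_addr >> 3)) & MASK29
    return (upper << 3) | lower


def branch_target_absolute(branch_addr, pc_relative):
    """Processor side: recover the branch destination address."""
    # Lower PC: assumed to be the lower 3 bits of the PC relative value.
    lower = pc_relative & MASK3
    # Upper PC: normal binary addition, wrapped to 29 bits.
    upper = ((branch_addr >> 3) + (pc_relative >> 3)) & MASK29
    return (upper << 3) | lower


branch = (0x00000001 << 3) | 0b010  # address of the branch instruction
dest = 0x00000000                   # address of the branch destination
rel = pc_relative_absolute(branch, dest)
print(hex(rel >> 3), bin(rel & MASK3))           # upper and lower fields
print(hex(rel & 0x1FFF))                         # lower 13 bits of the difference
print(hex(branch_target_absolute(branch, rel)))  # destination address
```

Because the lower field is copied rather than computed, only
the 29-bit addition remains on the processor side in this
sketch.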

Sixth Embodiment

The sixth embodiment of the present invention relates
to a linear calculation method for addresses. Unlike the
other embodiments, this linear calculation method
calculates instruction addresses without splitting the
calculation into an upper 29 bits and lower 3 bits.

The following explains the present method for finding
the address of a branch destination instruction from the
address of a branch instruction and a PC relative value.
While the processor that uses the carry method is equipped
with an upper PC calculator 411 for calculating the upper
29 bits and a lower PC calculator 405 for calculating the
lower 3 bits, a processor that uses the present linear
calculation method is only equipped with one PC calculator
for calculating a 32-bit address. The PC calculator in
this linear calculation method adds a 32-bit address of a
branch instruction and a 32-bit PC relative value. This
calculation is a normal binary addition.

The addition result of the PC calculator is set as
the address of the branch destination instruction. This
means that the lower 3 bits of the addition result are set
in the lower PC 404 and the upper 29 bits of the addition
result are set in the upper PC 403.






The following explains the calculation of the PC
relative value by the optimization apparatus 303, assembler
305, and linker 307, which is to say, the subtraction of
the address of the branch instruction from the address of
the branch destination instruction. Like the processor in
this embodiment, the optimization apparatus 303, assembler
305, and linker 307 are each provided with only one
calculator, the address subtraction means, for calculating
a 32-bit address. The address subtraction means in this
linear calculation method subtracts the address of a branch
instruction from the address of a branch destination
instruction. This calculation is a normal binary
subtraction. The subtraction result is then set as the PC
relative value.

Fig. 33 shows the object code that has been generated
using the linear calculation method of the present
embodiment. In Fig. 33, the PC relative values in
instructions 2706 and 2713 differ from those shown in Fig.
26. The following describes the method for calculating the
PC relative value for instruction 2706.

The address subtraction means in the linear
calculation method subtracts the 32-bit address
"32'h0000000a" of instruction 2706, which is the branch
instruction, from the 32-bit address "32'h00000000" of
instruction 2701, which is the branch destination, and so
obtains the address difference "32'hfffffff6".

The relocation information resolving means 713 judges
that the address difference "32'hfffffff6" can be expressed
by just its lower 13 bits "13'h1ff6", and so replaces the
label with "13'h1ff6" as the PC relative value to generate
instruction 2706.

The processor 309 executes the object code generated
as described above. When executing a branch instruction,
the processor 309 adds the upper PC 403 and lower PC 404
that have been amended to indicate the address of the
branch instruction to the PC relative value using the
present linear calculation method.

When the processor 309 executes instruction 2706 in
the object code shown in Fig. 33, the PC calculator in this
embodiment adds a 32-bit PC value "32'h0000000a", which has
the amended value of the upper PC 403 as the upper 29 bits
and the amended value of the lower PC 404 as the lower 3
bits, to the PC relative value "32'hfffffff6" and so
obtains the addition result "32'h00000000". After this,
the PC calculator updates the lower PC 404 to the lower 3
bits "3'b000" of this addition value, and the upper PC 403
to the upper 29 bits "29'h00000000" of this addition value.

In this way, the present linear calculation method
can calculate addresses using a standard calculator as the
PC calculator. This simplifies the structure of the
processor.
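
The linear calculation method amounts to ordinary 32-bit
two's-complement arithmetic, which the following sketch
illustrates with the values of the worked example. The
function names are hypothetical, and treating the 13-bit
field as a sign-extended value when it is widened back to
32 bits is an assumption; the description only states that
the difference "can be expressed by" its lower 13 bits.

```python
MASK32 = 0xFFFFFFFF


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def pc_relative_linear(branch_addr, dest_addr):
    """Tool side: destination address minus branch address, as 32 bits."""
    return (dest_addr - branch_addr) & MASK32


def fits_in_field(value32, bits):
    """True if value32 is unchanged by truncation to `bits` and sign extension."""
    return (sign_extend(value32, bits) & MASK32) == value32


def branch_target_linear(branch_addr, field, bits=13):
    """Processor side: one 32-bit addition of the PC and the PC relative value."""
    return (branch_addr + sign_extend(field, bits)) & MASK32


diff = pc_relative_linear(0x0000000A, 0x00000000)
print(hex(diff))                    # 32-bit address difference
print(fits_in_field(diff, 13))      # can it be held in 13 bits?
field = diff & 0x1FFF
print(hex(field))                   # value placed in the branch instruction
print(hex(branch_target_linear(0x0000000A, field)))
```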

Seventh Embodiment 

The seventh embodiment of the present invention
relates to a processor that interprets and executes PC
adding instructions and PC subtracting instructions and to
a compiler that generates such instructions.

Fig. 34 shows the processor of the present
embodiment. The processor of the present embodiment
differs from the processor in the first embodiment in that
it further includes a second lower PC calculator 2800 and a
second upper PC calculator 2802 and in that the first
instruction decoder 2801a, the second instruction decoder
2801b, and the third instruction decoder 2801c are all
provided with new functions.

The instruction decoders 2801a - 2801c are provided
with an extra function for decoding PC adding instructions
and PC subtracting instructions. Fig. 35A shows the
operation that corresponds to a PC adding instruction which
is shown in mnemonic form. As shown in Fig. 35A, a PC
adding instruction adds a PC relative value "disp" to the
value of the PC that is stored in a register and stores the
addition result in the same register. Fig. 35B shows the
operation that corresponds to a PC subtracting instruction
which is shown in mnemonic form. As shown in Fig. 35B, a
PC subtracting instruction subtracts a PC relative value
"disp" from the value of the PC that is stored in a register
and stores the subtraction result in the same register.

The second lower PC calculator 2800 and the second
upper PC calculator 2802 perform the PC adding instruction
and PC subtracting instruction described above, using the
same calculation rules as the lower PC calculator 405 and
the upper PC calculator 411 described in the first
embodiment.

Fig. 36 shows the construction of the compiler of the 
present embodiment. 

The source code 2901 is a program written in a high-
level language such as C.

The intermediate code converting unit 2902 converts
the source code 2901 into intermediate code 2903 which is
an internal expression for the compiler. This intermediate
code converting unit 2902 is a well-known technology and so
will not be described.

The PC value adding instruction converting unit 2904 
converts each intermediate code in the intermediate code 
2903 that adds a value of the PC and a variable into an 
assembler code 2906 for a PC adding instruction that is 
shown in Fig. 34. 

The instruction converting unit 2905 converts the
other intermediate codes into assembler code 2906. This
instruction converting unit 2905 is a well-known technology
and so will not be described.

The following describes a specific example of the
operation of the present compiler. Fig. 37 is a flowchart
showing the operation of this compiler.

First, the compiler receives an input of source code.
Fig. 38 shows source code which is written in C language.
In Fig. 38, the external functions g1, g2, g3, and g4 are
declared, and the function f is defined as a function that
receives the int-type variable "i". This function f
includes code that substitutes the address of function g1
into the pointer fp if the value of "i" is 1, substitutes
the address of function g2 into the pointer fp if the value
of "i" is 2, substitutes the address of function g3 into
the pointer fp if the value of "i" is 3, substitutes the
address of function g4 into the pointer fp if the value of
"i" is 4, and finally calls the function indicated by the
pointer fp (step S3600).
Next, the intermediate code converting unit 2902
converts the source code into intermediate codes. When
doing so, the intermediate code converting unit 2902
converts (a) a source code that substitutes a pointer to an
external function into a pointer variable into (b) an
intermediate code that adds the difference between the
address of the start of the present function and the address
of the start of the external function to a temporary
variable that stores the address of the start of the present
function, and substitutes the addition result into the
pointer variable.

Fig. 39 shows the intermediate codes that have been
generated from the source program shown in Fig. 38. The
intermediate code 3201 shown in Fig. 39 is an intermediate
code that has the label f marking the start of the function
and that substitutes the present value of the PC, which is
to say, the first address of function f, into the temporary
variable tmp. The intermediate code 3202 is intermediate
code that judges whether the value of variable i is not
"1". The intermediate code 3203 is an intermediate code
that branches to the label L1 when the judgement by
intermediate code 3202 is true, that is, variable i is not
"1". The intermediate code 3204 is executed when variable
i is "1", and adds a difference, obtained by subtracting a
first address of function f from the first address of
function g1, to the temporary variable tmp into which the
first address of function f has been substituted, and has
the addition result substituted into the variable fp. The
intermediate code 3205 is an intermediate code that
branches to the label L.

The intermediate code 3206 includes the label L1, and
is an intermediate code that judges whether variable i is
not equal to "2". The intermediate code 3207 branches to
label L2 when the judgement in intermediate code 3206 is
true, which is to say, when variable i is not "2". The
intermediate code 3208 is executed when variable i is equal
to "2", and is an intermediate code that adds a difference,
obtained by subtracting a first address of function f from
the first address of function g2, to the temporary variable
tmp into which the first address of function f has been
substituted, and has the addition result substituted into
the variable fp. The intermediate code 3209 is an
intermediate code that branches to the label L.

The intermediate code 3210 includes the label L2, and
is an intermediate code that judges whether variable i is
not equal to "3". The intermediate code 3211 branches to
label L3 when the judgement in intermediate code 3210 is
true, which is to say, when variable i is not "3". The
intermediate code 3212 is executed when variable i is equal
to "3", and is an intermediate code that adds a difference,
obtained by subtracting a first address of function f from
the first address of function g3, to the temporary variable
tmp into which the first address of function f has been
substituted, and has the addition result substituted into
the variable fp. The intermediate code 3213 is an
intermediate code that branches to the label L.

The intermediate code 3214 includes the label L3, and
is an intermediate code that adds a difference, obtained by
subtracting a first address of function f from the first
address of function g4, to the temporary variable tmp into
which the first address of function f has been substituted,
and has the addition result substituted into the variable
fp. The intermediate code 3215 includes the label L and is
an intermediate code that calls the function indicated by
the variable fp.

As described above, the intermediate codes in Fig. 39
do not simply substitute the absolute address of the
function g1, g2, g3 or g4 into the variable fp, but instead
add a difference between the first address of function f
and the first address of one of the functions g1, g2, g3,
and g4 to the first address of the function f and
substitute the addition result into the variable fp (steps
S3601 - S3603).

Next, the PC value adding instruction converting unit
2904 converts the intermediate codes into assembler code.
The PC value adding instruction converting unit 2904
searches for intermediate codes that add the value of the
PC to a PC relative value and converts such codes into
assembler code that uses the second lower PC calculator
2800 and the second upper PC calculator 2802. The
instruction converting unit 2905 then converts the
remaining intermediate codes into assembler code.

The PC value adding instruction converting unit 2904
ascertains that the operand tmp in intermediate code 3204
in Fig. 39 has been set at the value of the PC by the
intermediate code 3201 and that the operator indicates
an addition of the value of the PC and a PC relative value,
and so converts intermediate code 3204 into the assembler
code addpc that performs an addition using the second lower
PC calculator 2800 and the second upper PC calculator 2802.
In the same way, the PC value adding instruction converting
unit 2904 converts intermediate codes 3208, 3212, and 3214
into assembler codes addpc. The other intermediate codes
in Fig. 39 are converted into assembler codes by the
instruction converting unit 2905.
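
The selection made by the PC value adding instruction
converting unit 2904 can be illustrated with a small
sketch. The tuple encoding of the intermediate codes, the
function name select_addpc, and the fixed use of register
r1 are all assumptions made for illustration; only the
"addpc g-f,r1" form is taken from the description.

```python
def select_addpc(intermediate_codes):
    """Split intermediate codes into addpc lines and remaining codes.

    Assumed encodings (hypothetical, for illustration only):
      ("set_pc", tmp)               tmp = present value of the PC
      ("add_diff", dst, tmp, g, f)  dst = tmp + (g - f)
    Any other code is left for the ordinary instruction converting unit.
    """
    pc_temps = set()   # temporaries known to hold the value of the PC
    addpc_lines = []
    remaining = []
    for code in intermediate_codes:
        if code[0] == "set_pc":
            pc_temps.add(code[1])
            remaining.append(code)
        elif code[0] == "add_diff" and code[2] in pc_temps:
            # The operand holds the PC and the other operand is a
            # PC relative value, so a PC adding instruction is chosen.
            _, _dst, _tmp, g, f = code
            addpc_lines.append(f"addpc {g}-{f},r1")
        else:
            remaining.append(code)
    return addpc_lines, remaining


codes = [
    ("set_pc", "tmp"),
    ("branch_if_not_equal", "i", 1, "L1"),
    ("add_diff", "fp", "tmp", "g1", "f"),
    ("branch", "L"),
]
addpc_lines, remaining = select_addpc(codes)
print(addpc_lines)
```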

Fig. 40 shows the assembler code that has been
produced by converting the intermediate codes shown in Fig.
39. In Fig. 40, the assembler code 3301 has the label f
marking the start of a function and is an instruction that
transfers the value of the PC into register r1. The
assembler code 3302 is an instruction that judges whether
the constant "1" and the value of register r0 are not
equal. The assembler code 3303 is an instruction that
branches to label L1 when the judgement in assembler code
3302 is true. The assembler code 3304 has the second lower
PC calculator 2800 and the second upper PC calculator 2802
add the PC relative value that is the difference between
the first address of function g1 and the first address of
function f to the value of the PC which is the first
address of function f and is stored in the register r1, and
has the result transferred into register r1. The assembler
code 3305 is an instruction that branches to the label L.

The assembler code 3306 has the label L1 and is an
instruction that judges whether the constant "2" and the
value of register r0 are not equal. The assembler code
3307 is an instruction that branches to label L2 when the
judgement in assembler code 3306 is true. The assembler
code 3308 has the second lower PC calculator 2800 and the
second upper PC calculator 2802 add the PC relative value
that is the difference between the first address of
function g2 and the first address of function f to the
value of the PC which is the first address of function f
and is stored in the register r1, and has the result
transferred into register r1. The assembler code 3309 is
an instruction that branches to the label L.

The assembler code 3310 has the label L2 and is an
instruction that judges whether the constant "3" and the
value of register r0 are not equal. The assembler code
3311 is an instruction that branches to label L3 when the
judgement in assembler code 3310 is true. The assembler
code 3312 has the second lower PC calculator 2800 and the
second upper PC calculator 2802 add the PC relative value
that is the difference between the first address of
function g3 and the first address of function f to the
value of the PC which is the first address of function f
and is stored in the register r1, and has the result
transferred into register r1. The assembler code 3313 is
an instruction that branches to the label L.

The assembler code 3314 has the label L3 and is an
instruction that has the second lower PC calculator 2800
and the second upper PC calculator 2802 add the PC relative
value that is the difference between the first address of
function g4 and the first address of function f to the
value of the PC which is the first address of function f
and is stored in the register r1, and has the result
transferred into register r1. The assembler code 3315 has
the label L and is an instruction that calls the function
indicated by register r1. The assembler code 3316 is an
instruction that ends the function.

As described above, when there is a source code in
function f that substitutes a pointer to the external
function g into a pointer variable, the present compiler
does not generate an instruction (such as "mov r1,g") that
transfers the address of the external function g into
register r1, but instead generates an instruction ("addpc
g-f,r1") that adds a difference (g-f) in addresses
between function f and function g to the address of
function f that is stored in register r1, and has the
result transferred into register r1. Since the PC relative
value g-f is smaller than the absolute address g, the
overall code size of programs can be reduced by using such
addpc instructions. This has a further benefit for PIC
codes, where the addresses of a program in memory are
determined when the program is executed, since calculation
instructions that use such PC relative values must be used.
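
The benefit for position-independent code can be seen in
a short sketch: the constant g-f does not depend on where
the program is loaded, so the same instruction yields the
correct pointer at any load address. The offsets and load
addresses below are hypothetical, the function names are
illustrative, and plain 32-bit (linear) arithmetic is
assumed in place of the split upper and lower calculators.

```python
MASK32 = 0xFFFFFFFF


def addpc(disp, reg_value):
    """Model of "addpc disp,rN": register = register + PC relative value."""
    return (reg_value + disp) & MASK32


def pointer_to_g(load_base, f_offset, g_offset):
    """Compute the run-time address of g from inside function f."""
    disp = (g_offset - f_offset) & MASK32  # g-f, fixed when code is generated
    r1 = (load_base + f_offset) & MASK32   # PC at the start of f, known at run time
    return addpc(disp, r1)


F_OFFSET = 0x00000100  # hypothetical position of f within the program
G_OFFSET = 0x00002340  # hypothetical position of g within the program

for base in (0x00000000, 0x00400000):
    print(hex(base), hex(pointer_to_g(base, F_OFFSET, G_OFFSET)))
```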

In the same way as in the first embodiment, the
assembler code produced by the compiler of the present
embodiment is converted into object code that can be
executed by the processor by an optimization apparatus 303,
an assembler 305 and a linker 307. The processor executes
the PC adding instruction "addpc g-f,r1" in the generated
object code using the second lower PC calculator 2800 and
the second upper PC calculator 2802. In detail, the second
lower PC calculator 2800 adds the lower 3 bits of the
constant "g-f" and the lower 3 bits of the value stored in
register r1 and sends any carry that is generated to the
second upper PC calculator 2802. The second upper PC
calculator 2802 adds the upper 29 bits of the constant
"g-f", the upper 29 bits of the value stored in register r1,
and any carry it has received from the second lower PC
calculator 2800. A value given by setting the addition
result of the second lower PC calculator 2800 as the lower
3 bits and the addition result of the second upper PC
calculator 2802 as the upper 29 bits is then set in
register r1.

Note that while the instructions shown in Fig. 35A
and 35B respectively are an addition and a subtraction of a
constant and the value in a register, this need not be the
case. An addition and a subtraction of values in
registers, or an addition and a subtraction of a value in a
register and the PC may equally be used.

The calculation method used by the second lower PC
calculator 2800 and the second upper PC calculator 2802
also need not be the carry method used in the first
embodiment. Provided the same method is used by the
optimization apparatus 303, assembler 305, and linker 307
that generate the object code to be executed by the
processor, any of a no-carry method, a linear method, and
an absolute value method may be used.

Eighth Embodiment 

The eighth embodiment of the present invention 
relates to a debugger and a disassembler. 

Fig. 41 is a block diagram showing the construction
of the debugger and disassembler of the present embodiment.

The input control unit 4000 receives an input from
the user and controls the other components according to
this input.

The packet address specifying unit 4001 calculates
the upper 29 bits of the address of the inputted
instruction.

The in-packet address specifying unit 4002 calculates
the lower 3 bits of the address of the inputted
instruction.

The instruction memory 4004 stores the instructions
to be processed by the debugger and disassembler. As in
the first embodiment, the addresses of instructions are 32
bits in length and are composed of a packet address as the
upper 29 bits and an in-packet address as the lower 3 bits.
Fig. 41 shows how the instructions shown in Fig. 25 are
stored.

The instruction reading unit 4003 reads an
instruction packet indicated by the packet address
specified by the packet address specifying unit 4001 from
the instruction memory 4004.

The instruction buffer 4005 stores the instruction
packet read from the instruction memory 4004 by the
instruction reading unit 4003.

The instruction decoding unit 4006 extracts the
instruction unit with the in-packet address specified by
the in-packet address specifying unit 4002 from the
instruction buffer 4005 and decodes the extracted
instruction unit. When the instruction unit is a branch
instruction, the instruction decoding unit 4006 sends the
PC relative value 4007 to the lower PC calculator 4008 and
the upper PC calculator 4009.

The label table 4011 is a table storing each label
name associated with a corresponding instruction address.
This label table 4011 is generated by extracting
information from the optimized code when the assembler
described in the first embodiment generates machine
language codes.

In Fig. 41, the address "32'h00000000" corresponds to
the label f, the address "32'h00000008" corresponds to the
label L1, and the address "32'h12345580" corresponds to the
label L3.

The display unit 4012 displays the results of a
disassembling of an instruction.

The instruction replacing unit 4013 writes the
replacement instruction into the instruction unit(s) in the
instruction buffer 4005 that is/are indicated by the
in-packet address specified by the in-packet address
specifying unit 4002.

The instruction writing unit 4014 rewrites the
instruction packet in the instruction memory 4004 with the
packet address specified by the packet address specifying
unit 4001 using the amended instruction packet stored in
the instruction buffer 4005.

The upper PC calculator 4009 performs a calculation
on the upper 29 bits of the instruction address specified
by the packet address specifying unit 4001 and the upper 29
bits of the PC relative value 4007.

The lower PC calculator 4008 performs a calculation
on the lower 3 bits of the instruction address specified by
the in-packet address specifying unit 4002 and the lower 3
bits of the PC relative value 4007. The calculation
method used by these PC calculators is the same as that
used when generating the object code.

The following describes a specific example of the
operation of the present disassembler. Fig. 42 is a
flowchart showing the operating procedure of this
disassembler.

First, the input control unit 4000 receives a command
indicating the disassembling of an instruction and an input
of the address of the instruction to be disassembled. In
this specific example, the input control unit 4000 receives
"32'h0000001a" as the instruction address (step S4100).

Next, the packet address specifying unit 4001
specifies the packet address from the upper 29 bits of the
instruction address. The instruction reading unit 4003
then reads the instruction packet with the specified packet
address from the instruction memory 4004 and stores it in
the instruction buffer 4005. In this example,
"29'h00000003" is specified as the packet address, and the
instruction sequence "ld (r2),r0||bra 13'h1fec||add r2,r3"
is stored in the instruction buffer 4005 (step S4101).

The in-packet address specifying unit 4002 then
specifies the in-packet address from the lower 3 bits of
the instruction address and informs the instruction
decoding unit 4006 of the instruction unit that has the
specified in-packet address. The instruction decoding unit
4006 then extracts the indicated instruction unit from the
instruction buffer 4005. In this example, "3'b010" is
specified as the in-packet address and the instruction "bra
13'h1fec" that is the second unit in the instruction buffer
4005 is inputted into the instruction decoding unit 4006
(step S4102).

The instruction decoding unit 4006 judges whether the
inputted instruction is a branch instruction. In this
example, the inputted instruction "bra 13'h1fec" is a
branch instruction, so that this judgement is true (step
S4103).

When the instruction is a branch instruction, a
calculation is performed on the PC relative value 4007
indicated in the instruction and the address of the inputted
instruction. The lower PC calculator 4008 performs an
addition or a subtraction on the in-packet address of the
inputted instruction and on the lower 3 bits of the PC
relative value 4007 and sends the calculation result to the
label search unit 4010. The upper PC calculator 4009
performs an addition or a subtraction on the packet address
of the inputted instruction and on the upper 29 bits of the
PC relative value 4007 and sends the calculation result to
the label search unit 4010. The label search unit 4010
specifies the address of a label from the calculation
result for the upper bits and the calculation result for
the lower bits. In this example, the label address
"32'h00000008" is specified by a calculation using the
address "32'h0000001a" of the inputted instruction and the
PC relative value 4007 "13'h1fec" (steps S4103, S4104).

The label search unit 4010 then refers to the label
table 4011 and finds the label name that has the specified
address. In this example, the label L1 corresponds to the
address "32'h00000008" (step S4107).

The display unit 4012 displays the assembler name of
the branch instruction and the label name found by the
label search unit 4010. In this example, the display unit
4012 displays the assembler name "bra" of the branch
instruction and the corresponding label name "Label L1"
(step S4108).

The instruction decoding unit 4006 has the display
unit 4012 display only the assembler name when the
extracted instruction is not a branch instruction (step
S4109).
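
The disassembling steps above can be traced with the
following sketch. The addition rule used here is an
assumption inferred from this worked example, not the
table of the first embodiment: the in-packet addresses
3'b000, 3'b010 and 3'b100 are taken to wrap around after
the third unit with a carry into the packet address. This
choice reproduces the label address given above; the
function names, the dictionary form of the label table and
the output format are likewise illustrative.

```python
MASK29 = (1 << 29) - 1

# Label table 4011 as given for Fig. 41: address -> label name.
LABELS = {0x00000000: "f", 0x00000008: "L1", 0x12345580: "L3"}


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def branch_target_carry(instr_addr, disp13):
    """Assumed carry rule: in-packet addresses 0, 2, 4 wrap at 6 with a carry."""
    low_sum = (instr_addr & 0b111) + (disp13 & 0b111)
    carry, lower = divmod(low_sum, 6)
    upper = ((instr_addr >> 3) + sign_extend(disp13 >> 3, 10) + carry) & MASK29
    return (upper << 3) | lower


def disassemble_branch(instr_addr, mnemonic, disp13):
    """Show the label name when the target address is in the label table."""
    target = branch_target_carry(instr_addr, disp13)
    label = LABELS.get(target)
    if label is None:
        return f"{mnemonic} 13'h{disp13:04x}"
    return f"{mnemonic} {label}"


print(hex(branch_target_carry(0x0000001A, 0x1FEC)))
print(disassemble_branch(0x0000001A, "bra", 0x1FEC))
```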

The following describes a specific example of the
operation of the present debugger.

Fig. 43 is a flowchart showing the operation of the
present debugger.

First, the input control unit 4000 receives a command
indicating the debugging of an instruction, the address of
an instruction to be replaced, and the instruction to be
used to replace this instruction. In this specific
example, the input control unit 4000 receives
"32'h0000001a" as the instruction address and the
subtraction instruction "sub r0,r1" as the replacement
instruction (step S4200).

Next, the packet address specifying unit 4001
specifies the packet address from the upper 29 bits of the
instruction address. The instruction reading unit 4003
then reads the instruction packet with the specified packet
address from the instruction memory 4004 and stores it in
the instruction buffer 4005. In this example,
"29'h00000003" is specified as the packet address, and the
instruction sequence "ld (r2),r0||bra 13'h1fec||add r2,r3"
is stored in the instruction buffer 4005 (step S4201).

The in-packet address specifying unit 4002 then
specifies the in-packet address from the lower 3 bits of
the instruction address. In this example, the in-packet
address "3'b010" is specified (step S4202).

If the specified in-packet address is "3'b000", the
first unit in the instruction packet in the instruction
buffer 4005 is replaced with the inputted replacement
instruction. If the specified in-packet address is
"3'b010", the second unit in the instruction packet in the
instruction buffer 4005 is replaced with the inputted
replacement instruction. If the specified in-packet
address is "3'b100", the third unit in the instruction
packet in the instruction buffer 4005 is replaced with the
inputted replacement instruction. In this example, the
specified in-packet address is "3'b010", so that the
instruction "bra 13'h1fec" in the second unit in the
instruction packet in the instruction buffer 4005 is
replaced with the inputted replacement instruction "sub
r0,r1". As a result, the instruction packet in the
instruction buffer 4005 becomes "ld (r2),r0||sub r0,r1||add
r2,r3" (steps S4203 - S4207).

The instruction writing unit 4014 replaces the
instruction packet at the indicated packet address in the
instruction memory 4004 with the instruction packet stored
in the instruction buffer 4005. In this example, the
instruction packet "ld (r2),r0||bra 13'h1fec||add r2,r3" at
the packet address "29'h00000003" in the instruction memory
4004 is replaced with the instruction packet "ld
(r2),r0||sub r0,r1||add r2,r3" in the instruction buffer
4005.
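
The read, replace and write-back sequence of the debugger
can be sketched as follows. The instruction memory is
modelled as a dictionary from packet address to a list of
three instruction units held as text; this representation
and the function name are assumptions for illustration,
while the addresses and instructions are those of the
example above.

```python
# In-packet address -> position of the unit within the packet.
UNIT_INDEX = {0b000: 0, 0b010: 1, 0b100: 2}


def replace_instruction(memory, instr_addr, replacement):
    """Replace one instruction unit by rewriting its whole packet."""
    packet_addr = instr_addr >> 3    # upper 29 bits: packet address
    in_packet = instr_addr & 0b111   # lower 3 bits: in-packet address
    buffer = list(memory[packet_addr])            # read the packet into a buffer
    buffer[UNIT_INDEX[in_packet]] = replacement   # replace the indicated unit
    memory[packet_addr] = buffer                  # write the packet back
    return buffer


memory = {0x00000003: ["ld (r2),r0", "bra 13'h1fec", "add r2,r3"]}
replace_instruction(memory, 0x0000001A, "sub r0,r1")
print("||".join(memory[0x00000003]))
```

Only whole packets are read from and written to the memory
in this sketch, which mirrors how the debugger avoids
accessing individual units that are not byte-aligned.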

As described above, the disassembler of the present
embodiment can disassemble instructions that are executable
for the processor 309 of the first embodiment. When an
instruction is disassembled, instead of just displaying the
PC relative value, the disassembler has the upper PC
calculator and lower PC calculator calculate the address at
which the label is located, uses the address to search the
label table, and so displays the appropriate label name.

The debugger of the present embodiment reads
instructions from the memory in units of instruction
packets that are byte-aligned, rewrites an instruction in
the instruction buffer, and writes the instructions back
into the memory in units of instruction packets. This
method is suited to the debugging of instructions that are
not byte-aligned.

Note that the calculation methods used by the lower 
PC calculator and the upper PC calculator do not need to be 
the carry method described in the first embodiment, so that 
another method, such as a separation method, an absolute 
value method, or a linear method, can be used. 

The compiler, optimization apparatus, assembler,
linker, processor, disassembler, and debugger of the
present invention have been explained by way of the first
to eighth embodiments of the present invention, though it
should be obvious that the present invention is not limited
to these. Two example modifications are given below.

(1) In the first to sixth embodiments, the assembler code
302, the optimized code 304, the relocatable codes 306, and
the object code 308 may be stored in a mask ROM, a
semiconductor memory such as flash memory, a magnetic
storage medium such as a floppy disk or a hard disk, or an
optical disc such as a CD-ROM or DVD.

(2) In the seventh embodiment, the assembler codes 2906 may
be stored in a mask ROM, a semiconductor memory such as
flash memory, a magnetic storage medium such as a floppy
disk or a hard disk, or an optical disc such as a CD-ROM or
DVD.

Although the present invention has been fully 
described by way of examples with reference to accompanying 
drawings, it is to be noted that various changes and
modifications will be apparent to those skilled in the art. 
Therefore, unless such changes and modifications depart 
from the scope of the present invention, they should be 
construed as being included therein. 
